Search engine

ABSTRACT

The present invention provides for a method of updating an internet search engine database with the results of a user's selection of specific web page listings from the general web page listing provided to the user as a result of his initial keyword search entry. By updating the database with the selections of many different users, the database can be updated to prioritize those web listings that have been selected the most with respect to a given keyword, thereby presenting first the most popular web page listings in a subsequent search using the same keyword search entry.

[0001] This application is related to U.S. application Ser. No. 60/078199 entitled “Improved Search Engine” that was filed on Mar. 16, 1998.

FIELD OF THE INVENTION

[0002] The present invention relates to a method and apparatus that allows for enhanced database searching, and more particularly, for use as an internet search engine.

BACKGROUND OF THE RELATED ART

[0003] An efficient and practical means of obtaining relevant information and also screening unwanted/uninteresting information has been an ongoing need, especially since the inception of the internet. This need is particularly acute at present due to the exponential growth in the number of world-wide web sites and the sheer volume of information contained therein. In an attempt to index the information available on the internet, a number of software search engines have been created via which a user enters a search command comprised of suitable keywords from a keyboard at his personal computer. The search command is transmitted to a server computer that has a search engine associated with it. The search engine receives the search command and then uses it to scan for these key words through a database of web addresses and the text stored on the web sites. Thereafter, the results of the scan are transmitted from the server computer back to the user's computer and displayed on the screen of the user's computer.

[0004] In order for the search engine to be aware of new web sites and to update its records of existing sites, either the proprietors of the web sites notify the search engine themselves or the information may be obtained via a ‘web crawler’ to update the database at the server computer. A web crawler is an automated program which explores and records the contents of a web site and its links to other sites, thereby spreading between sites in an attempt to index all the current sites.

[0005] This database structure and method of searching it pose some significant difficulties. The internet growth-rate has resulted in a substantial backlog in the scanning of new sites, notwithstanding the fact that web sites are frequently deleted, re-addressed, updated and so forth, thus leaving the search engine with outdated and/or misleading information. Although the web crawlers can be configured to prioritize possible key-words according to their location (title, embedded link, address etc), nevertheless, depending on the type of search engine used, substantial portions of the web site text (often involving the majority or even all of the site text) are still required to be scanned. This results in colossal storage requirements for the search engine. Furthermore, a typical key word search may bring up an excessively large volume of material, the majority of which may be of little interest to the user. The user typically makes a selection from the list based on the brief descriptions of the site and explores the chosen sites until the desired information is located.

[0006] These results are in the form of a list, ranked according to criteria specific to the search engine. These criteria may range from the number of occurrences of the key-words anywhere within the searched text, to methods giving a weighting to key-words used in particular positions (as previously mentioned). When multiple key-words have been used, sites are also ranked according to the number of different key-words applicable. A fundamental drawback of all these ranking systems is their objectivity: they are determined according to the programmed criteria of the search engine, and the emphasis placed on particular types of site design, rather than any measure of the actual users' opinions. Indeed this can lead to the absurd situation whereby in an attempt to ensure a favorable rating by the most commonly used search engines, some designers deliberately configure their sites in the light of the previously mentioned criteria, to the detriment of the presentation, readability and content of the site.

SUMMARY OF THE INVENTION

[0007] It is an object of the present invention to ameliorate the aforementioned disadvantages of conventional search engines by harnessing the cerebral power of the human operator.

[0008] It is a further object of the present invention to provide a novel search engine with enhanced efficiency, usability and effectiveness with reduced system storage and/or computational requirements in comparison to existing software engines.

[0009] It is a further object of the present invention to provide a variety of indications of the popularity of the search data, together with an indication of its date of creation or updating.

[0010] In order to obtain the above recited advantages of the present invention, among others, one embodiment of the present invention provides for a method of updating an internet search engine database with the results of a user's selection of specific web page listings from the general web page listing provided to the user as a result of his initial keyword search entry. By updating the database with the selections of many different users, the database can be updated to prioritize those web listings that have been selected the most with respect to a given keyword, thereby presenting first the most popular web page listings in a subsequent search using the same keyword search entry.

[0011] In another embodiment of the present invention, a method of determining content to provide along with listings transmitted from a server computer to user sites is provided. In this embodiment, there is obtained a content listing from each one of a plurality of different developer sites. Each of the content listings includes content, a developer identifier, a keyword, and a keyword selection factor. Thereafter, there is determined a particular keyword from the obtained keywords that is the same for different content listings. For that particular keyword, the keyword selection factor is used in determining when to transmit different content listings to the user sites.

[0012] In still another embodiment, there is provided a method of updating a keyword table with the results of a user's selection of specific keywords which were obtained from a list of related keywords presented to the user. By updating the database with selections of many different users associated with that same keyword, appropriate keywords can be provided and presented first when that same keyword is subsequently entered.

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] These and other advantages of the present invention may be appreciated from studying the following detailed description of the preferred embodiment together with the drawings in which:

[0014] FIG. 1 illustrates certain of the overall features of the present invention;

[0015] FIG. 2 illustrates various inputs to the search engine and, for each of the different capabilities, illustrates the outputs that will be provided according to the present invention;

[0016] FIGS. 3A and 3B illustrate an overview of the process by which web pages are selected in making up the search results provided to the end user according to the present invention;

[0017] FIG. 4 illustrates the data sets used for different web-page searches according to the present invention;

[0018] FIG. 5 shows the various data sets previously described, and various inputs and actions that result in a list of suggested web pages being provided according to the present invention;

[0019] FIG. 6 illustrates the implementation of a popular search according to the present invention;

[0020] FIG. 7 illustrates the implementation of a hot off the press search according to the present invention;

[0021] FIG. 8 illustrates the implementation of a high-flyers search according to the present invention;

[0022] FIG. 9 illustrates the implementation of a random search according to the present invention;

[0023] FIG. 10 illustrates the implementation of a previous past favorites search according to the present invention;

[0024] FIG. 11 illustrates the implementation of a collective search according to the present invention;

[0025] FIG. 12 illustrates the implementation of a date created search according to the present invention;

[0026] FIG. 13 illustrates the implementation of a customized search according to the present invention;

[0027] FIG. 14 illustrates the implementation of searching based upon a group identity according to the present invention;

[0028] FIG. 15 illustrates a keyword eliminator feature according to the present invention;

[0029] FIG. 16 illustrates the process of determining which search results should be used to make up the cumulative surfer trace table according to the present invention;

[0030] FIG. 17 illustrates active suggestion of web pages according to the present invention;

[0031] FIG. 18 illustrates passive suggestion of web pages according to the present invention;

[0032] FIG. 19 provides an overview of suggesting keywords according to the present invention;

[0033] FIG. 20 illustrates the manner of creating data sets for suggested keywords according to the present invention;

[0034] FIG. 21 illustrates a variety of manners in which a list of suggested keywords can be created according to the present invention;

[0035] FIG. 22 illustrates how content is attached to web page listings according to the present invention;

[0036] FIG. 23 illustrates various content data sets and operations that populate them according to the present invention;

[0037] FIG. 24 illustrates various content data sets and operations that are used to select data from them according to the present invention;

[0038] FIG. 25 illustrates web page listings and other content data according to the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0039] FIGS. 1A and 1B illustrate certain of the overall features of the present invention, which will be described in further detail hereinafter. It is initially noted that like-numbered reference numerals in various Figures and descriptions will be used in the following descriptions to refer to the same or similar structures, actions or process steps.

[0040] The present invention is preferably implemented in a network environment wherein each computer contains, typically, a microprocessor, memory, and modem, and certain of the computers contain displays and the like, as are well known. As shown in FIG. 1B, a plurality of user sites/computers 100A-100D are shown, as are a plurality of server computers 102A-B, and developer sites/computers 104A-B. It is understood that in a typical internet network, different server computers 102 can be interconnected together, as is illustrated. Further, while only a few user sites, developer sites and server computers are shown, it is understood that thousands of such computers are interconnected together.

[0041] While the specific embodiments of the present invention are written for applications in which the invention is implemented as sequences of coded program instructions operated upon by a server computer 102 as illustrated, it will be understood that certain sequences of these program instructions could instead be implemented in other forms, such as processors having specific instructions specifically tailored for the applications described hereinafter.

[0042] As will be illustrated hereinafter, additional operations, transparent to the user, are implemented in order to obtain search results in the future based upon currently made searches. As shown, the present invention has various capabilities, each of which is illustrated in a parallel flow in FIG. 1A, which illustrates an overview of the different capabilities that can be ongoing simultaneously. In terms of overall capabilities, start block 10 shows three: suggesting web pages 12, suggesting keywords 14, and content suggestion 16.

[0043] In order for web pages 12 to be selected by a user according to the present invention, there is a step 18 in which the type of search to be performed is selected. Thereafter, in step 20, search input obtained from one of a variety of sources is input and used along with the algorithm selected in step 18 to determine search results. The results of this search are then displayed to the user, as shown by steps of displaying a created list of web pages, displaying passively suggested web pages, and displaying actively suggested web pages, identified as steps 22, 24, and 26, respectively, in FIG. 1. This capability, and how it is implemented, will be described in more detail hereinafter.

[0044] In order for keyword suggestion to take place, which the user may or may not select, there is preferably an initial step 28 in which the type of keyword search algorithm to use is selected. Although many systems may have only one such algorithm, various ones, as described hereinafter, are possible. Once the keyword search algorithm is selected, step 30 follows in which, based upon a keyword entered by a user, the current set of keyword data is operated upon to determine associated keywords. The results of this operation are then displayed to the user in 30. This capability, and how it is implemented, will be described in more detail hereinafter.

[0045] The previously mentioned web page and keyword selection capabilities inure to the direct benefit of the end user. Another novel feature of the present invention, which indirectly inures to the benefit of the end user, directly benefits the advertiser, because it allows for content to be targeted in real time based upon various criteria. As will be described more fully hereinafter, a content providing algorithm is initially selected which will determine how content is selected in step 34. Step 36 follows, and based upon inputs from users and content providers, which content to show is determined. Thereafter, the advertisements are displayed for the user to see, simultaneously with the display of either keywords and/or web pages.

[0046] While FIG. 1 illustrates certain overall features according to the present invention, many of the advantageous features of the present invention are not, as mentioned previously, observable to the user, but instead transparent to the user. They are, however, significant in order to fully explain how the present invention is implemented and are explained hereinafter.

[0047] FIG. 2 is provided to illustrate various inputs to the search engine according to the present invention, and, for different capabilities, illustrates the outputs that will be provided. More detailed explanations are provided hereinafter. Data potentially input by the search engine user include:

[0048] keyword 52: this is the word or phrase that the user enters to find a list of web pages.

[0049] profile types 54: these are the groups of people users associate themselves with, e.g. US, male, doctor etc.

[0050] user ID 56: this is a unique identification for each user that chooses to register with the search engine. This can be done via a cookie or logon.

[0051] search type 58: this can be actively chosen by the searcher to determine the type of search results they would like (popular, new, etc).

[0052] date-time 60: this is passively recorded when a searcher uses the system.

[0053] IP address 62: this is passively recorded when a searcher uses the system.

[0054] other 64: this includes other personalization information such as search customization preferences, keywords for web page suggestion etc. This information is entered actively once by the user, then used to personalize the search results each time the user (identified by user ID) uses the search engine.

[0055] Data from web-page developers include:

[0056] URL 66: this is the URL address of the web page or pages that they wish to submit.

[0057] description 68: this is a 2-3 line description of the information on their web-page.

[0058] keywords 70: these are the keywords that the web page developer would like to associate their web-page with.

[0059] target audience 72: this is the target audience (profile types 54) that the web page developer particularly wants to target.

[0060] date-time 74: this is passively recorded whenever a web-page developer submits a web page.

[0061] Data from content providers include:

[0062] bids 76: these are $ bids for content, as described later.

[0063] content details 78: this includes all details of content providers including address, content details etc.

[0064] Results from other search engines 80: these are the results for a keyword search from other existing search engines.

[0065] Outputs of the search engine 10 are: lists of web pages 90 (depending on the input data, a list of web pages can be produced in web page determination step 82, described further hereinafter); content keywords 92 (the search engine suggests other keywords for users to try, produced in key word determination step 84, described further hereinafter); and content 94 (the search engine sends out selected content as produced in determine content step 86, described further hereinafter).

[0066] To facilitate ease of reference and aid understanding, the aforementioned and subsequently mentioned data-set definitions are reiterated and expanded upon below (and where appropriate, the structure of the dependent data-sets used to create the defined data-set is shown in tabular form) with reference to the preferred embodiment of the present invention. Thereafter, certain of these will be explained in even greater detail to fully teach how to make and use the present invention.

[0067] Locations: a plurality of unique information entities.

[0068] Web-pages: Locations in the form of Web-page URL (Uniform Resource Locator) addresses.

[0069] Key-word: The word or phrase that is entered in the search engine

[0070] Hit-list: The list of web-pages (URL addresses) that is the result of the key-word search. This hit-list ranks the relevance of the web-pages relative to the key-word. This hit-list always has a key-word associated with it.

Input data set: Key-word (temporary); Database to match the key-word with (permanent)
Output data set: Hit-list, a ranked hit-list of Web-pages (temporary)

[0071] Permanent data set: Retained long term (although it changes over time)

[0072] Temporary data set: Created only for the duration of the search

[0073] Surfer trace: This is a measure of how users search. It is a trace of the key words they search for, the URLs subsequently selected and how long they spend there, from which a ranking of web-pages for a user (surfer) can be calculated. It is a measure of which web-pages they found most useful after the key-word search. The combination of all surfer traces is used to create a users' choice hit-list.

Input data set: Key-word (temporary); User selections from initial search results (temporary), i.e. Web pages visited (URLs); Times spent at each URL; IP address of user
Output data set: Surfer trace, a list of web-pages users found useful for each key-word (can be permanent or temporary)

[0074] Users' choice hit-list: This is a semi-permanent ranking of web-pages associated with every key-word and indicates how useful Internet users found each of the web-pages associated with the key-word. The users' choice hit-list is incrementally updated by a new surfer trace.

Input data set: Surfer trace (can be permanent or temporary); Users' choice hit-list (permanent)*
Output data set: New Users' choice hit-list, a ranked hit-list of “popular” Web-pages (permanent)
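By way of illustration only, the incremental update of a users' choice hit-list by successive surfer traces might be sketched as follows. The sketch assumes the simplest possible popularity measure, a count of selections per key-word and URL; the class name UsersChoiceHitList and its methods are hypothetical, and time spent at each URL is deliberately ignored here.

```python
from collections import defaultdict


class UsersChoiceHitList:
    """Hypothetical per-keyword popularity ranking built from surfer traces."""

    def __init__(self):
        # keyword -> url -> number of times the url was selected (assumed measure)
        self._counts = defaultdict(lambda: defaultdict(int))

    def update(self, keyword, url):
        """Apply one surfer trace: the user chose `url` after searching `keyword`."""
        self._counts[keyword][url] += 1

    def ranked(self, keyword):
        """Return URLs for `keyword`, most selected first (ties broken by URL)."""
        urls = self._counts.get(keyword, {})
        return sorted(urls, key=lambda u: (-urls[u], u))


hit_list = UsersChoiceHitList()
for chosen in ["http://b.example/", "http://a.example/", "http://b.example/"]:
    hit_list.update("weather", chosen)
print(hit_list.ranked("weather"))
```

A weighting by time spent, or a decay of older selections, could replace the plain count without changing the shape of the update.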

[0075] New web-page list: This is a list of new web-pages that is created by URL submissions from web-page developers. When a web developer updates a web-page, they can submit the web-page address, brief information about the page and a list of key-words that the developer decides are relevant. The web-page is then placed on the top of each of the key-word new web-page lists.

Input data set: All web-page developers' information about web address and key-words
Output data set: New web-page list (permanent)

[0076] Content Provider's list: This is a list (associated with each key-word) of content providers which must typically pay to illustrate content with the key-word. The price paid is dependent on the number of other content providers, the amount they spend and the number of times the key word is searched for.

Input data set: Key-word; Content Provider's bids for content spots
Output data set: Content Providers list, a list of content associated with each key-word (permanent)

[0077] High-flyers hit-list: This is a list of web-pages (associated with every key-word) that are increasing in popularity at the highest rate. It is an indication of how rapidly web-pages are rising up the users' choice hit-list and it is used as a means to ensure that new emerging web-pages rise to the top of the users' choice hit-list.

Input data set: Old Users' choice hit-list (temporary); New Users' choice hit-list (permanent)
Output data set: High-flyers hit-list, a ranked list of web-pages that are rising in popularity the fastest
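As a minimal sketch under stated assumptions, a high-flyers hit-list could be derived by comparing positions in an old and a new users' choice hit-list. The function name high_flyers is hypothetical, the rate of rise is approximated by the number of positions gained between two snapshots, and a page absent from the old list is assumed to have started just below its last entry.

```python
def high_flyers(old_ranking, new_ranking):
    """Rank pages by how many positions they gained between two snapshots.

    Both arguments are lists of URLs, best first. Pages that did not rise
    are omitted. Treating unseen pages as starting at position
    len(old_ranking) is an assumption of this sketch.
    """
    old_position = {url: i for i, url in enumerate(old_ranking)}
    unranked = len(old_ranking)
    rises = []
    for new_index, url in enumerate(new_ranking):
        rise = old_position.get(url, unranked) - new_index
        if rise > 0:
            rises.append((rise, url))
    # Largest rise first; ties broken alphabetically for a stable result.
    rises.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, url in rises]


old = ["a", "b", "c", "d"]
new = ["c", "a", "e", "b", "d"]
print(high_flyers(old, new))
```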

[0078] Personal hit-list: This is a list of web-pages the individual user has found most useful for each key-word search they have done in the past. It is like an automatic book-marking data set for each individual user.

Input data set: Key-word; Individual surfer trace (permanent)
Output data set: Personal hit-list, a ranked list of web pages that an individual has found useful in the past

[0079] Collective Search hit-lists: This can be a combination of any of the above hit-lists. There are many different ways that these hit-lists can be combined.

Input data set: Crawler hit-list (temporary); Users' choice hit-list (permanent); Advertisers' list (permanent); New web-page list (permanent); High-flyers list (permanent); Personal hit-list (permanent)
Output data set: Collective Search hit-lists (Default), a ranked hit-list of Web-pages displayed to the user after the key-word search. It can be a combination of any of the hit-lists above (temporary)

[0080] Crawler key-word list: This is a list of key-word suggestions that the user may find useful. This is found by matching the key-word entered by the user to the database of key-words and phrases that other users have tried. This is the equivalent of the crawler hit-list, though it is a ranking of key-words rather than Web-pages. The method for doing this uses a similar algorithm to a spell-checker, only it does it for phrases. It also suggests key-words based on previous URL selections from sequences of user key-words.

Input data set: Key-word (temporary); Database of all key-words used (permanent)
Output data set: Ranked hit-list of other key-words the user may want to try (temporary)
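The spell-checker-like matching of an entered phrase against previously used key-words might, purely as an illustration, be approximated with the similarity matching in Python's standard difflib module. The function name suggest_keywords, the limit of three suggestions and the 0.6 similarity cutoff are assumptions of this sketch and are not taken from the description above.

```python
import difflib


def suggest_keywords(entered, known_keywords, limit=3):
    """Return previously used key-words that closely resemble `entered`.

    Matching is case-insensitive; the entered phrase itself is not suggested.
    """
    by_lowercase = {keyword.lower(): keyword for keyword in known_keywords}
    target = entered.lower()
    close = difflib.get_close_matches(
        target, list(by_lowercase), n=limit, cutoff=0.6
    )
    return [by_lowercase[match] for match in close if match != target]


known = ["weather forecast", "weather", "stock prices"]
print(suggest_keywords("wether forecast", known))
```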

[0081] Surfer key-word list: This is a data set comprising a list of key-words that the individual user found useful after the key-word was selected. This is found by tracking which key-words the user decided to use. This is equivalent to the surfer trace.

Input data set: Key-word (temporary); Data about what key words were used from the key-word suggester
Output data set: Ranked list of other key-words (associated with the key-word) that this individual user found useful (semi-permanent)

[0082] Key-word suggester: This is a data set consisting of a permanent ranking of other key-words that users have found useful, compiled from successive surfer key-word lists, and is linked to each key-word (this is the equivalent of the users' choice hit-list).

Input data set: Surfer key-word list (temporary or permanent); Existing users' choice hit list (permanent)
Output data set: New users' choice key-word list (permanent)

[0083] The discussion provided above provides the language necessary to more fully describe the present invention. FIGS. 3A and 3B provide an overview of the search engine capabilities according to the present invention in which web pages are selected in making up the search results provided to the end user. In step 112, the user enters up to 4 sets of data: keyword 52, profile type 54, search type 58 and User ID 56. The IP address 62 and date-time 60 are not entered by the user but can be read when a user uses the search engine. This data is used in parallel in steps 114 and 116 to produce a list of web pages. Step 114, discussed in detail hereinafter, is the process of selecting web pages from novel search engine data sets produced in accordance with the present invention. This can run, if desired, in parallel with step 116, which obtains a selection of web pages from other existing search engines. Thereafter, the selections of web pages from steps 114 and 116 are combined and tagged in step 118. The process of tagging the list of web pages, described in more detail below, enables a set of data, shown as surfer trace data in FIG. 3, to be created and sent back to the search engine when the search engine user selects a web-page from the list in step 120. The process of selecting a tagged web-page creates the following series of data which is used to update the search engine data sets: keyword 124, URL 126, user ID 128, IP address 130, date-time 132, brief web page description 134.

[0084] Although it is preferred to use all of these different data types in the surfer trace data, use of different combinations of this data is fully within the intended scope of the present invention. The description 134 will typically only be included in the preferred embodiment of the invention when a new site is added to the data set 114 of the search engine 10, and the description used will be that description that appears on the original list of web pages. The date-time data 132 may only indicate that a site was selected, rather than record the period of time a user was at a particular site, as explained further hereinafter. This process is invisible to the user who, upon selecting the web-page from the list of web pages, is taken directly to the corresponding URL, step 122. The implementation of steps 114, 118 and 120 will be described in more detail hereinafter.

[0085] After the initial selection the user may choose to access another of the web-page URL search results. Depending on the relevance of the site, the user may spend time reading, downloading, exploring further pages, embedded links and so forth, or if the site appears irrelevant/uninteresting, the user may return directly back to the search results after a short period. The time difference between the two selections is recorded as the difference between two date/time data 132 from subsequent selections from the list of web page searches (in this embodiment, one can only measure the time spent at one web page if another selection is made after visiting that web page; this then provides another surfer trace 132 which allows a time difference to be calculated). This surfer trace data on the popularity of web pages is used to rank the subsequent searches, as described further hereinafter.
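The time-difference calculation described above can be sketched as follows, as an illustration only. The function name dwell_times and the representation of each selection as a (URL, date-time) pair are assumptions of this sketch; consistent with the embodiment described, no duration can be computed for the last page selected.

```python
from datetime import datetime


def dwell_times(selections):
    """Estimate time spent at each page from successive result-list selections.

    `selections` is a list of (url, date_time) pairs recorded for one user
    and one key-word search. Returns (url, seconds) for every selection that
    was followed by another one.
    """
    ordered = sorted(selections, key=lambda selection: selection[1])
    durations = []
    for (url, selected_at), (_, next_selected_at) in zip(ordered, ordered[1:]):
        seconds = (next_selected_at - selected_at).total_seconds()
        durations.append((url, seconds))
    return durations


clicks = [
    ("http://a.example/", datetime(1999, 1, 1, 12, 0, 0)),
    ("http://b.example/", datetime(1999, 1, 1, 12, 4, 30)),
    ("http://c.example/", datetime(1999, 1, 1, 12, 5, 0)),
]
print(dwell_times(clicks))
```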

[0086] Thus, according to the present invention, it is the human users' powers of reasoning and analysis that are being used to establish the relevance of the different results to the subject matter of the search. The present invention utilizes the cumulative processing and reasoning of all the human users to provide a vastly more effective means of obtaining the required information sources than is presently possible with the type of method described above.

[0087] As described above, human brain power is captured by recording which web pages the user goes to after each keyword search. According to the present invention, collecting the surfer trace data is achieved by sending, in the list of web pages generated by the search to the user, hidden links that will automatically send information back to the search engine (or a subsidiary server). While the user only sees that his intended link is displayed, the hidden link notifies the search engine of the transfer, which process can be executed with a Java applet. Thus, when the Internet user selects a web-page it takes the user to that address but also sends off the surfer trace data to the search engine 10, which notes what has been selected. When the user returns to the list of web pages and selects another web page listing, another Java applet is then executed which creates another surfer trace. The difference between the date-time data in the surfer traces from two sequential selections captures the time period that the user has been at the previous web site. This occurs without the user knowing this data is being sent.

[0088] In another embodiment, rather than using multiple Java applets to collect a complete list of surfer trace data, there is no description data 134, and the date-time data 132 indicates that a user visited a particular web site. In one specific embodiment, the user must visit a particular web site for greater than a predetermined period of time, such as one minute or fifteen minutes, depending on what is an appropriate time to have looked at the site, for the visit to the site to count and for any surfer trace data to be sent back to the search engine 10, as will be described hereinafter. In this embodiment, each applet contains all of the information necessary to update the database at the search engine. Another embodiment collects the surfer trace data prior to a user navigating to the intended web site. Other ways of obtaining this surfer trace data are possible and are within the intended scope of the present invention.

[0089] Thus, the search results page according to the present invention is differently formatted from conventional search engines' results pages. The difference is in action rather than content. Visually, the page looks the same to the user as standard search results from other search engines.

[0090] An example illustrates this point: In a conventional search, the results page for a search of the keyword “Weather” may read: 1. www.weather.com Today's weather forecast. Today is expected to be fine and sunny everywhere.

[0091] The HTTP link associated with the “www.weather.com” label is “http://www.weather.com”. This means that if the user selects this link, they will navigate to this page directly.

[0092] In contrast, according to the present invention, the tagged result page for the search made using the keyword “Weather” may read:

[0093] 1. www.weather.com Today's weather forecast. Today is expected to be fine and sunny everywhere.

[0094] The HTTP link associated with the “www.weather.com” label is “link.asp?n=1”. If the user selects this link, therefore, in a process that is invisible to the user, the user is first directed to the link.asp page on the site corresponding to the web server using the search engine 10 according to the present invention, and parameter n is passed with value 1.

[0095] Server side code (application code that runs on the web server) uses this parameter to identify the URL and description of the user's chosen site. This information is then stored in a database Table along with other surfer trace data. The server side code then executes a redirect operation to the user's required URL. The user then sees their required page appear.
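The tag, record and redirect sequence described above might be sketched as follows. This is a framework-free illustration, not the server side code of the invention: the function names tag_results and handle_click, the in-memory list standing in for the database Table, and the use of HTTP status 302 to represent the redirect operation are all assumptions.

```python
from datetime import datetime, timezone


def tag_results(results):
    """Replace each direct URL with a numbered link back to the search server.

    `results` is a list of (url, description) pairs. Returns the tagged
    links shown to the user and the lookup kept on the server.
    """
    lookup = {n: result for n, result in enumerate(results, start=1)}
    tagged = [(f"link.asp?n={n}", description)
              for n, (_, description) in lookup.items()]
    return tagged, lookup


def handle_click(n, lookup, keyword, user_id, ip_address, trace_table):
    """Record surfer trace data for selection `n`, then redirect the user."""
    url, description = lookup[n]
    trace_table.append({
        "keyword": keyword,
        "url": url,
        "user_id": user_id,
        "ip_address": ip_address,
        "date_time": datetime.now(timezone.utc),
        "description": description,
    })
    return 302, url  # assumed representation of the redirect operation


results = [("http://www.weather.com", "Today's weather forecast.")]
tagged, lookup = tag_results(results)
trace_table = []
print(tagged[0][0])
print(handle_click(1, lookup, "Weather", "user-1", "192.0.2.1", trace_table))
```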

[0096] The source of search results is independent of this activity. The destination page of the user is independent of this activity. The process is one of recording a user, keyword and destination into a database. This method of tracking can only record the initial web-page visited after a keyword search. If the user continues to return to the search results list then subsequent web-page visits can be recorded.

[0097] The surfer trace data that is sent back to the data sets 114 of the search engine 10 as a result of the user selecting the web-page can be encrypted to prevent fraudulent users from sending fake data to the search engine.
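The description above speaks of encryption without naming a scheme. One way the stated aim (rejecting fake data) might be approached is sketched below using a keyed hash (HMAC), which authenticates the data rather than hiding it and is therefore a substitute technique, not the encryption mentioned above. The secret key, the function names and the choice of fields to sign are all assumptions.

```python
import hashlib
import hmac

# Hypothetical secret known only to the search engine server.
SECRET_KEY = b"hypothetical-server-secret"


def sign_trace(keyword, n, key=SECRET_KEY):
    """Compute a signature to attach to a tagged link for (keyword, n)."""
    message = f"{keyword}|{n}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_trace(keyword, n, signature, key=SECRET_KEY):
    """Accept surfer trace data only if its signature matches."""
    expected = sign_trace(keyword, n, key)
    return hmac.compare_digest(expected, signature)


signature = sign_trace("Weather", 1)
print(verify_trace("Weather", 1, signature))
print(verify_trace("Weather", 2, signature))
```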

[0098] Another method of tracking where a user may connect to from an initial URL selection (if they do not return to the search result page) is to run the selected web-pages as part of a ‘frame’ located at the search engine web-site. This permits a complete record of the web pages visited to be recorded after a keyword is entered. However, this imposes an additional level of complexity to the system with a possible decrease in system response time.

[0099] As previously mentioned, the surfer trace data that can becollected includes keyword 124, URL 126, user ID 128, IP address 130,date-time 132, brief web page description 134, and is identified as suchsince it provides a trace or record of how searchers (surfers) use thesearch engine. This data is used to improve future searches building onthe preferences of previous searchers. The surfer trace is thus ameasure of the preferred choices of an individual user or web ‘surfers’from the initial search results for a particular set of key-words.

[0100] How the Data Sets are Created that Determine the List of WebPages

[0101]FIG. 4 illustrates the data sets used for different web-pagesearches according to the present invention. The data sets (tables) thatare used to determine the list if web pages include keyword table 164,profile ID table 166, security table 168, cumulative surfer trace table170, keyword URL link table 172, personal link table 174, and web-page(URL) table 188.

[0102] The structure of the aforementioned data sets is described in more detail hereinafter. The descriptions that follow show the data arranged in a spreadsheet fashion, with multiple values per cell and many blank cells. Illustration in this manner is convenient for explaining the present invention, but is not an efficient storage and retrieval method. As will be apparent to those skilled in the art, a relational database model would be used to implement the data storage according to the present invention such that there may be multiple fields or tables involved to store the data and each field will store only one value.

[0103] Keyword Table (164)

[0104] The contents of keyword data table 164 of FIG. 4 are shown in more detail in Table 1 shown below, which is a list of keywords, including phrases, and the number of times they have been requested. If the list becomes unmanageably large, the key-words that are not used again after a predetermined time period could be deleted from the list. However, it would be desirable to keep the majority or all of the keyword phrases that are entered, if possible.

TABLE 1
List of information requests and the number of times each is requested

Key-word      Cumulative number of times the      Unique number for
              keyword is requested (W)            each keyword
Key-word 1    W1, W2, W3 etc
Key-word 2
Key-word 3
Key-word 4
Key-word 5
Key-word 6
Key-word 7

[0105] The cumulative number of times a keyword is requested may be segregated according to the different “user profiles” selected (W1, W2, W3, . . . ), e.g. W1=total searches, W2=male profile, W3=female profile, W4=USA profile and so forth. It should be noted that the sum of the W's will be greater than the total number of times a keyword has been requested because the user may fall into more than one profile category, e.g. a male (W2) from the USA (W4). This would become a list of not only the number of users searching with that key-word but also a list of the type of user (according to the profile type selected) searching for that keyword. Keywords that mean the same thing in different languages are different keywords, as long as the spelling is different, although they could be related using the keyword suggester, as described hereinafter.

[0106] Web-page Table (188)

[0107] The contents of web-page table 188 of FIG. 4 are shown in more detail in Table 2 shown below, which contains a list of Internet web-pages. Each web-page has a URL address, an associated 2-3 line description, a unique web page number for each URL (which can also be any character, symbol code or representation) and the cumulative number of times the URL has been visited. The unique number is assigned to the URL address so that the full URL string need not be stored in the subsequent data tables.

TABLE 2
List of information suppliers and a description of the web-page

URL Address      2-3 line       Unique number for     Frequency the URL
                 description    each URL address      (web page) is visited
URL address 1
URL address 2
URL address 3
URL address 4
URL address 5
URL address 6
URL address 7
. . .

[0108] Keyword URL Link Table (172)

[0109] The contents of keyword URL link table 172 of FIG. 4 are shown in more detail in Table 3 shown below. This table is of particular significance with respect to the present invention because it contains information about the links between information suppliers (URL addresses or web pages) and information requests (keywords).

[0110] This data is recorded in further data sets which describe the relationship between the key-words and occurrences as defined by the following three parameters.

[0111] the cumulative number of significant visits (hits) to each URL address corresponding to each key-word (herein referred to as X or weighting factor X). This is a measure of the popularity of the URL for each keyword and is determined from the surfer traces.

[0112] the previous cumulative number of significant visits measured at an earlier predetermined instant (herein referred to as Y or weighting factor Y).

[0113] a date-time factor relating to the instant of the creation or input of each said web-page (herein referred to as Z or weighting factor Z). Z is the date-time at which a web-page developer submitted a web-page to the search engine.

[0114] Not all combinations of key-words and URL addresses will have data for X, Y and Z.

TABLE 3
Links between information suppliers (web-pages) and information requests (key-words)

Key-word Key-word Key-word Key-word Key-word
URL address 1 X,Y,Z
URL address 2 X,Y,Z
URL address 3 X,Y,Z
URL address 4 X,Y,Z
URL address 5 X,Y,Z X,Y,Z
URL address 6
URL address 7

[0115] Profile Types with the Keyword URL Link Table

[0116] The popularity of web pages will be different for different groups of people. The inclusion of multiple profile types will produce multiple values of X, Y and Z in Table 3, e.g. one may have a Global and a New Zealand popularity rating denoted by X1, X2, Y1, Y2 etc.

Keyword “sports”
URL address relating to Rugby         X1 = 520     X2 = 52
URL address relating to Basketball    X1 = 4000    X2 = 20

[0117] In this example the global popularity (using the general profile type) for the Rugby and Basketball URL addresses is 520 and 4000 respectively, and 52 and 20 respectively for the New Zealand profile type.

[0118] When the general profile type setting is used (ranked based on X1), the Basketball site would be ranked at the top. When the New Zealand setting is chosen (ranked based on X2) the Rugby site would be highest. This would be a reflection of the preferences of New Zealanders. This is a very simple method of storing the preferences of different groups of people.
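
The “sports” example can be reproduced with a small relational layout in which each field stores only one value. The sketch below uses Python's built-in sqlite3 module purely for illustration; the table name keyword_url_link, its column names, the profile labels and the example.com style addresses are assumptions and are not taken from the specification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (keyword, URL, profile type), so each field holds a single value.
conn.execute(
    """
    CREATE TABLE keyword_url_link (
        keyword      TEXT NOT NULL,
        url          TEXT NOT NULL,
        profile_type TEXT NOT NULL,
        x            REAL NOT NULL,
        PRIMARY KEY (keyword, url, profile_type)
    )
    """
)
conn.executemany(
    "INSERT INTO keyword_url_link VALUES (?, ?, ?, ?)",
    [
        ("sports", "rugby.example", "global", 520),
        ("sports", "rugby.example", "new_zealand", 52),
        ("sports", "basketball.example", "global", 4000),
        ("sports", "basketball.example", "new_zealand", 20),
    ],
)


def popular_list(keyword, profile_type):
    """Return the URLs for a keyword, most popular first, for one profile type."""
    rows = conn.execute(
        "SELECT url FROM keyword_url_link "
        "WHERE keyword = ? AND profile_type = ? ORDER BY x DESC",
        (keyword, profile_type),
    )
    return [url for (url,) in rows]


global_ranking = popular_list("sports", "global")
nz_ranking = popular_list("sports", "new_zealand")
```

With the figures above, the global ranking places the Basketball address first and the New Zealand ranking places the Rugby address first.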

[0119] One would expect New Zealand-based rugby web-sites to rate higher than an overseas site on the New Zealand list, but there is no reason that this has to be the case. Someone in Spain may have the best Rugby site in the world. The system evaluates web-pages only on the quality of information as perceived by the users; the physical location of the site is immaterial.

[0120] There could be a vast range of X values representing different countries, occupations, sex, age and so forth, enabling the popularity among different groups to be captured very simply. Users could choose to combine any of the X values according to their personal interests/characteristics.

[0121] As an example, if say,

[0122] X1 is for males

[0123] X2 is for females

[0124] X3 is for New Zealanders

[0125] X4 is for USA

[0126] X5 is for engineers

[0127] X6 is for lawyers . . .

[0128] A “male” and “New Zealander” using the search engine would increment both X3 and X1. This facility would increase the data requirement of the system but it could vastly improve the search results for different users. The total popularity of the web-page needs to be stored as a separate number as users may contribute to more than one of the groups of people. The sum of all of the individual popularities would be greater than the total popularity because a user can belong to more than one profile type.

[0129] To simplify the system for the user there would be a default profile type (selection of X's) with an option to use other profile types to do specific searches. For example, a user may have a default profile type of a New Zealand male, but if a technical search is required a “global engineers” profile type may be chosen that reflects the cumulative search knowledge of engineers around the world.

[0130] The extent of personalization could be dependent on the frequency of searching. For example, common keywords such as “news” would have a high degree of personalization (a large range of X values) and less common key-words such as “English stamps” would have little or no personalization (only a global X value). The degree of personalization could be a function of the frequency with which the key-word is used (found from Table 1).

[0131] Cumulative Surfer Trace Table (170)

[0132] The contents of cumulative surfer trace table 170 of FIG. 4 are shown in more detail in Table 4 shown below. Information about the links between web pages and keywords in Table 3 (also referred to as keyword URL link table 172) is updated by the surfer trace data. The cumulative surfer trace is the combined information from all individual surfer traces and it is used to determine how many “hits” (significant visits) each web-page had for each key-word.

[0133] The information collected from each individual surfer trace is a series of inputs previously described, and shown below in table form.

TABLE 4
Each row is one surfer trace and the combined rows are the cumulative surfer trace

IP Number    User ID    Keyword    URL (webpage)    Date-time

[0134] The way the surfer trace data is processed to update Table 3 is described further hereinafter.

[0135] Profile ID Table (166)

[0136] The contents of profile ID table 166 of FIG. 4 are shown in more detail in Table 5 shown below. This table includes a unique identification, password, contact email and a default profile type which the user normally uses to perform searches.

TABLE 5
User identification Table

User identification    password    email          Default profile    Other information
Joe Bloggs             dogs        jbloggs@AOL    U.S., Male

[0137] The user's default profile type is stored as part of the user's personal preferences profile, which would be accessed by entering some form of personal identification to the system. This information could be supplied when logging on to the data search engine, or the search engine could leave a “cookie”, as that term is known in the art, on the computer to identify a user (there would be an optional e-mail address and password (or similar) associated with the logon procedure). The IP address itself would not be a sufficient means of identification as it is not necessarily unique to the individual users.

[0138] The other information can include user defined preferences for how the search results are combined and keywords that are of particular interest to the user. This information can be used to actively customize the search results and suggestions of web pages to visit.

[0139] Personal Link Table (174)

[0140] The contents of personal link table 174 of FIG. 4 are shown in more detail in Table 6 shown below. Table 6 is identical in structure to Table 3, and can be used to record a user's personal preferences relating to each URL, including the number of times visited and the key-words. In this Table 6, however, Z is not the date that the web-page developer submitted the web-page but the date-time that the user visited the web page. This allows users to refine a search by defining the last time they visited the web page.

TABLE 6
Links between information suppliers (web-pages) and information requests (key-words) for an individual user

Key-word Key-word Key-word Key-word Key-word
URL address 1 x,y,z
URL address 2 x,y,z
URL address 3 x,y,z
URL address 4 x,y,z
URL address 5 x,y,z x,y,z
URL address 6
URL address 7

[0141] The data in Table 6 is only accessed by the individual that created it, and is accessible using a user ID that is preferably independent of changes in the user's e-mail or IP address and would thus enable their past personal preferences to be retained during such changes.

[0142] This Table 6 data set could be stored either at the search engine site or on an individual's computer. Storing on local PCs would require additional software to be installed on the user's computer. There are numerous advantages to storing the information at the search engine, including the fact that users are likely to go there more often and are unlikely to change search engines once they have a substantial bookmark list.

[0143] Security Table (168)

[0144] The contents of security table 168 of FIG. 4 are shown in more detail in Table 7 shown below. To ensure that users do not submit the same key-word over and over to increase its popularity, the following security data table is used. Each entry is a single piece of information, i.e. yes or no. This table can be created for links between keywords and IP addresses or links between keywords and User IDs.

TABLE 7
Security Table to ensure one computer user does not submit keywords to artificially boost the popularity of a web-page

Key-word 1 Key-word 2 Key-word 3 Key-word 4
IP address 1 1
IP address 2 1
IP address 3
IP address 4 1
IP address 5 1

[0145] Described hereinafter are the processes that are used by the present invention to populate each of the FIG. 4 tables mentioned previously.

[0146] Populating the Keyword Table 164

[0147] This table is populated every time a user enters a keyword 52 into the search engine. A submitted keyword is compared to the keyword list in Table 1 (keyword table 164) and added if it is not already present. If it is present, the cumulative number is increased by one. If the user has a profile type then the cumulative number for the keyword for each type of profile will also be incremented (W1, W2, W3 etc).
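
The counting rule for the keyword table can be sketched in a few lines. This is an illustrative Python fragment only; keyword_table, record_keyword and the profile labels are hypothetical names, and the "total" entry stands in for the overall count W1.

```python
from collections import defaultdict

# keyword -> {profile label -> cumulative request count}.
# The "total" entry plays the role of W1 (all searches).
keyword_table = defaultdict(lambda: defaultdict(int))


def record_keyword(keyword, profiles=()):
    """Add the keyword if it is new, then increment its total and per-profile counts."""
    counts = keyword_table[keyword]
    counts["total"] += 1
    for profile in profiles:
        counts[profile] += 1


record_keyword("weather", profiles=("male", "usa"))
record_keyword("weather", profiles=("female",))
record_keyword("weather")  # a user with no profile type selected
```

After these three requests the total is 3 while the per-profile counts sum to 3 only by coincidence; a single user contributing to several profiles, as in the first call, is what allows the per-profile sum to exceed the total.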

[0148] Populating the Web-page Data Table (URL Table) 188

[0149] This table is populated in a number of ways, including:

[0150] a user selecting a URL address 126 that is not already in Table 2 (URL table 188). The URL address 126 and description 134 are put directly into the web-page data table 188. The new URL is assigned a unique identification number.

[0151] in Step 176, as shown in FIG. 4, web-page developers can submit a URL 187 and description 68, which also go directly into the web-page data table 188,

[0152] web crawlers may also add URL addresses and descriptions (the description is either the first few lines of the web-page or the HTML coded “title”). This is not an essential element of the system but it could be a method to obtain URLs and descriptions. With this search system web crawlers are more likely to be used to verify the information rather than to find new information.

[0153] Populating the Cumulative Surfer Trace Table 170

[0154] The cumulative surfer trace table 170, also referred to above as Table 4, is populated each time a “tagged” web-page is selected by a user. This sends a packet of surfer trace information, such that the surfer trace data is added to the table each time the user selects another web page from a web page list.

[0155] Populating the Keyword URL Link Table 172

[0156] The data from the cumulative surfer trace 170 is used to update the popularity of web pages as recorded in Table 3 (X, Y), also referred to as the keyword URL link table 172. The frequency of updating Table 3 with the data from the cumulative surfer trace (170) to obtain new values of X and Y is a variable that can be changed, with intervals ranging from shorter than every hour to longer than every month. It should be noted that different keywords can be updated at different intervals of time.

[0157] An intermediary step in processing the cumulative surfer trace is to form a cumulative surfer hit table. This is subsequently used to modify the values of X and Y in Table 3.

[0158] As mentioned above, the simplest method of recording a link (“useful visit” or “hit”) between a keyword and a URL would be to count each keyword, URL pairing in a surfer trace as a “hit”. A more meaningful and sophisticated method is only to count a location selection as valid if the user meets certain criteria. One such criterion could be the user exceeding a specified time at a location. If this criterion was not met, the selection would not increase the cumulative value of X in Table 3.

[0159] It is also possible to increment the value of X based on the time spent at the web page. The longer the time spent, the more this increments the value of X. X does not have to be a whole number.

[0160] Due to the variations in web-site capabilities in terms of log-on times, downloading times, bandwidth, and response times, the predetermined time used to denote a valid ‘hit’ may be suitably altered. Specialist web crawlers may be employed to independently validate such data.

[0161] The selection of a content provider's banner after a keyword search counts as a hit for their web-page (incrementing the value of X). This will enable their web pages to possibly go up the popularity list associated with the keyword. This acts as a mechanism to enable a web-page developer to pay to be seen with a keyword. They cannot pay to go up the popularity list; this will only occur if people visit their site, spend time there and record a valid hit for the popular list. The value of a content hit can vary (e.g. it could be 1 or 0.5 or 7) depending on the emphasis one wants to place on how much that content affects the popularity ranking.

[0162] This cumulative surfer trace information can be processed in a large number of ways to populate Table 8 (below). Grouping the cumulative surfer trace according to the IP address or user ID produces the search pattern for an individual user. This is a list of key-words, URLs and times. This allows the time spent at each web-page to be calculated for each user (it is not possible to calculate the time spent at the last web page of a search session as there is no time record after the user goes to that web page).

[0163] If the time between each visit is longer than a certain time period, one is added to the cumulative surfer hit (α) table for the key-word and URL pair. (This is the simplest method; methods in which relevancy is proportional to the time spent at the site, for example, are also properly within the scope of the present invention.)

TABLE 8
Cumulative surfer hit table created from accumulated surfer traces

Key-word Key-word Key-word Key-word
URL address 1
URL address 2 α α
URL address 3 α α
URL address 4 α
URL address 5
URL address 6 α
URL address 7 α
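
The grouping and time-threshold rule just described can be sketched as follows. This is a minimal Python illustration under stated assumptions: a trace is taken to be a (user ID, keyword, URL, date-time) tuple, the function name cumulative_hits and the 60 second threshold are hypothetical, and the gap between two consecutive selections is used as the time spent at the earlier page.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def cumulative_hits(traces, min_dwell):
    """Count one hit per (keyword, URL) visit that lasted longer than min_dwell.

    traces: iterable of (user_id, keyword, url, date_time) tuples.
    """
    by_user = defaultdict(list)
    for user_id, keyword, url, when in traces:
        by_user[user_id].append((when, keyword, url))

    hits = Counter()
    for visits in by_user.values():
        visits.sort(key=lambda visit: visit[0])
        # The last visit of each user has no later time record, so it is never counted.
        for (start, keyword, url), (next_start, _, _) in zip(visits, visits[1:]):
            if next_start - start > min_dwell:
                hits[(keyword, url)] += 1
    return hits


t0 = datetime(1998, 3, 16, 12, 0, 0)
traces = [
    ("u1", "sports", "a.example", t0),
    ("u1", "sports", "b.example", t0 + timedelta(seconds=10)),
    ("u1", "sports", "c.example", t0 + timedelta(seconds=130)),
]
alpha = cumulative_hits(traces, min_dwell=timedelta(seconds=60))
```

Here only b.example receives a hit: the user left a.example after 10 seconds, stayed 120 seconds at b.example, and c.example is the last page of the session.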

[0164] The cumulative surfer hit is used to update the value X in Table 3 in the following way:

$$X_{\text{new}} = (X_{\text{old}} \cdot HF) + \alpha$$

[0165] HF is the history factor, which is a number between 0 and 1. The history factor does not have to be the same for every key-word and could be varied depending on the rate at which the keyword is used.

[0166] The data collected for Table 8 is used to recalculate the values of X in Table 3 after a predetermined time period. The frequency of updating Table 3 will influence the value of the history factor (HF) chosen. The reason for multiplying the existing X by a “history factor” is so that the perceived popularity does not last indefinitely. The history factor reduces the weighting attached to the past popularity. To illustrate by way of an example, the key-word “sports news” may have an existing popularity with the following ranking (based on the number of hits per web-page, X):

1 Winter Olympics web-page     X = 19000
2 Soccer results web-page      X = 18000
3 Baseball results web-page    X = 15000
4 Golf news web-page           X = 15000

[0167] The cumulative surfer hit table for a week may be:

1. Winter Olympics web-page     α = 500
2. Soccer results web-page      α = 1800
3. Baseball results web-page    α = 1500
4. Golf news web-page           α = 4600

[0168] The change in the number of hits reflects the fact that the Winter Olympics has finished and the Masters golf tournament has started. If one has a “history factor” of 0.9 then the new popularity (X) will be:

1 Golf news web-page           18100 (0.9 × 15000 + 4600)
2 Soccer results web-page      18000 (0.9 × 18000 + 1800)
3 Winter Olympics web-page     17600 (0.9 × 19000 + 500)
4 Baseball results web-page    15000 (0.9 × 15000 + 1500)
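
The arithmetic of this example can be checked with a short script. The following Python sketch applies the update rule of paragraph [0164] to the figures above; the function name update_popularity and the dictionary layout are illustrative assumptions.

```python
def update_popularity(x_old, alpha, history_factor):
    """Return X_new = X_old * HF + alpha for every web-page of one keyword."""
    return {
        page: x_old[page] * history_factor + alpha.get(page, 0)
        for page in x_old
    }


x_old = {
    "Winter Olympics": 19000,
    "Soccer results": 18000,
    "Baseball results": 15000,
    "Golf news": 15000,
}
alpha = {
    "Winter Olympics": 500,
    "Soccer results": 1800,
    "Baseball results": 1500,
    "Golf news": 4600,
}

x_new = update_popularity(x_old, alpha, history_factor=0.9)
# Most popular first, as in the popular list.
ranking = sorted(x_new, key=x_new.get, reverse=True)
```

The resulting order is Golf news, Soccer results, Winter Olympics, Baseball results, matching the table in paragraph [0168].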

[0169] Thus, the more popular web-pages can emerge and the less popular decline, reflecting the fluctuation of interest over time in different subjects and events.

[0170] The database is therefore utilizing the human mind to provide a powerful indication of what people find useful on the Internet. The users themselves replace a substantial computation requirement that would otherwise be required to filter through such searches.

[0171] The value of Y in Table 3 is the old value of X, and the value of Y will be updated at intervals that are deemed appropriate, which interval could be minutes, hours, days, weeks or longer. The update interval does not need to be the same for all keywords, as previously mentioned. This is used to calculate the rate of change of popularity of web pages and can be used as a selection criterion.

[0172] Different Profile Types in the Web-page/URL Link Table

[0173] The cumulative surfer trace includes information on users' profiles, so Table 8 can be calculated with subscripted values of α for different profile types. These values of α₁, α₂, α₃ etc would correspond to the profile types for the subscripted values of X. This allows the popularity among different groups of people to be recorded.

[0174] New Web-page Data Input to the Web-page/URL Link Table 172

[0175] The simplest method of having new pages recorded by the search engine is for web-page developers to submit information, shown as action 176 in FIG. 4, which information includes URL 66, key-words 70, site descriptions 68, target audience 72 and date-time 74, each time they create or update a web-page.

[0176] This information directly updates Tables 2 (URL table 188 of FIG. 4) and 3 (Keyword URL link table 172 of FIG. 4). The URL 66 and description 68 are entered in Table 2 and the date-time (74) at which the page is submitted (the Z value) is inserted in Table 3 for each of the key-words (70). Users are allowed a set number of keywords 70 with which they can submit their web page. An example of what Table 3 would look like with just Z values is given below (format dd-mm-yy).

TABLE 9
Data Table created from submission by web developers

Key-word Key-word Key-word Key-word Key-word Key-word
URL address 27/02/98 27/02/98
URL address 28/02/98 28/02/98 28/02/98
URL address
URL address 18/02/98 18/02/98 18/02/98
URL address
URL address 28/02/98
URL address 29/02/98

[0177] If there is no date for the combination of the URL and keyword in Table 3, then the new date is automatically inserted. If a date already exists in the table, then the dates are compared and if the dates are too close, i.e. separated by less than a predetermined period, then the old date remains and the new date is ignored. This stops people from constantly resubmitting their web pages to get to the top of the new web page list. If the URL in Table 3 has other keywords with values of Z closer than the predetermined period then the submission is also not allowed. This stops web-page developers from resubmitting their web pages with different sets of keywords.
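
The acceptance rule for a developer submission can be sketched as below. This Python fragment is an illustration only: submit_page, z_table and the 30 day period are hypothetical, and it assumes that an accepted submission overwrites an older Z value, which the text implies but does not state.

```python
from datetime import datetime, timedelta


def submit_page(z_table, url, keywords, submitted, min_interval):
    """Apply a submission to z_table and report whether it was accepted.

    z_table maps (url, keyword) to the stored date-time Z.
    """
    # Reject if any keyword already stored for this URL was submitted too recently.
    too_recent = any(
        stored_url == url and submitted - z < min_interval
        for (stored_url, _), z in z_table.items()
    )
    if too_recent:
        return False
    for keyword in keywords:
        z_table[(url, keyword)] = submitted  # assumed: new date replaces the old one
    return True


z_table = {}
period = timedelta(days=30)
t0 = datetime(1998, 2, 27)
first = submit_page(z_table, "page.example", ["weather"], t0, period)
# One day later with a different keyword: refused, and nothing is stored.
second = submit_page(z_table, "page.example", ["forecast"], t0 + timedelta(days=1), period)
# After the period has passed the same submission is accepted.
third = submit_page(z_table, "page.example", ["forecast"], t0 + timedelta(days=31), period)
```

Checking every keyword stored for the URL, rather than only the submitted ones, is what blocks resubmission under a different keyword set.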

[0178] When users submit a URL they could target it at specific types of users (different profile types Z1, Z2, Z3 etc) as per Table 3. For example, a URL submission specifically targeted at New Zealanders (e.g. Z1) will appear at the top of the new list for a keyword when New Zealanders search for that keyword. It will remain at the top until someone else submits a URL for that keyword targeted at New Zealanders. URLs that are targeted at other audiences will not appear as new sites for New Zealanders or, alternatively, they will not feature as high in the new list as the ones specifically targeted at New Zealanders.

[0179] The data on new web pages does not necessarily have to be entered by web-page developers. It could be automated by having a web document template that automatically submits data to the search engine whenever the information on the web-page has been significantly changed. It would prompt the web-page developer to change any key-words as appropriate.

[0180] Another embodiment requires sending specialist crawlers out to find web site addresses and key-words, though this has many of the drawbacks of existing web-crawlers. It could only be effective if web designers deliberately configured their page with the key-words identified. Any web site designer/proprietor willing to do this would also presumably be willing to submit any updates to the search engine to benefit from the instantaneous listing in the search results.

[0181] An extension of this principle is to auto-detect if a web address possessed key-word information in the database and then automatically send an invitation to provide the information to enable their web-page to be found easily. The number of key-words to be submitted with each web-page is preferably fewer than 50, and more preferably within the range of about 5 to 20. This also advantageously forces web-site designers to find the most appropriate keywords to describe their site and also enables them to choose the audience they wish to target.

[0182] The web-page submission process may also include a web-page developer identification process that restricts the ability of people to use the system fraudulently. This may include a payment to prevent multiple web-page submissions.

[0183] Populating the Profile ID Table 166

[0184] Profile ID table 166 of FIG. 4 is populated from the direct inputs from users. When users search they can choose their profile type 54 from a layered drop down menu, which could include, for example:

[0185] Gender (Male or Female)

[0186] Occupation (Professional, student etc)

[0187] Age category etc

[0188] After the user selects different profile types from the options, they are asked whether they wish to save this as their default profile type. This is then recorded in Table 5 (profile ID table 166). The user may also select personalization options from a specific personalization options page rather than a drop down menu on the search page.

[0189] Populating the Personal Link Table 174

[0190] The cumulative surfer trace is used to identify the search patterns of individual users based on sorting by User ID 128. This information is used to update the personal link table 174 in the same way that the cumulative surfer trace 170 is used to update Table 3 (keyword URL link table 172). This table stores users' past preferences as a form of automatic bookmarking.

[0191] Populating the Security Table 168

[0192] Each time a user enters a keyword 52 into the search engine it updates the security table 168 (Table 7) by making a link between the keyword 52 and the IP address 62 (or making a link between the keyword 52 and the User ID 56). The data in Table 7 is cleared periodically as the purpose is to stop systematic repeat searching from affecting the popularity lists (value of X in Table 3) rather than stopping individuals who occasionally perform a repeat keyword search from affecting the popularity list.
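
A yes/no link table with periodic clearing can be sketched with a set. The Python class below is illustrative only; SecurityTable and its method names are hypothetical, and how the result feeds the decision to send tagged pages is left to the description of that decision.

```python
class SecurityTable:
    """Remembers which (identifier, keyword) links were made since the last clear."""

    def __init__(self):
        self._links = set()

    def first_search(self, identifier, keyword):
        """Record the link and return True only the first time it is seen.

        identifier may be an IP address or a user ID.
        """
        link = (identifier, keyword)
        if link in self._links:
            return False
        self._links.add(link)
        return True

    def clear(self):
        """Periodic reset, so that an occasional repeat search counts again."""
        self._links.clear()


table = SecurityTable()
results = [
    table.first_search("192.0.2.1", "weather"),  # first time: True
    table.first_search("192.0.2.1", "weather"),  # repeat: False
    table.first_search("192.0.2.7", "weather"),  # different address: True
]
table.clear()
after_clear = table.first_search("192.0.2.1", "weather")
```

Storing only the presence of a link mirrors the single yes or no entry per cell of Table 7.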

[0193] Determining the List of Web Pages

[0194] FIG. 5 shows the various data sets previously described, and various inputs and actions that result in a list of suggested web pages being provided, and will be described in more detail hereinafter. As shown in FIG. 5, user data entered into the search engine can include: keyword 52, user ID 56, search type 58, IP address 62, and profile types 54. How this data can be used to determine a list of web pages 250, as well as to decide which of the list of web pages to tag (step 118 of FIG. 3) for the purposes of creating a surfer trace, is described hereinafter.

[0195] The numbers (X, Y and Z) in Table 3, which corresponds to keyword URL link table 172 in FIG. 5, contain all the information required to give the following types of searches 58:

[0196] Popular-list search: a ranked hit-list of the most popular URLs for that keyword based on the number X

[0197] Hot off the press search: a ranked hit-list of the newest URLs for the keyword based on the date/time (Z)

[0198] High-flyers search: a ranked hit-list of the best emerging URLs based on the difference between X and Y

[0199] Random search: a hit-list that is a random sample of URLs that have any of the numbers X, Y or Z

[0200] Date created search: a hit-list based on the date-time Z and the user-specified date of interest (not just the newest)
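
Four of the search types listed above reduce to simple orderings of the X, Y and Z values stored for one keyword. The Python sketch below is an illustration with made-up data: the URL numbers and values in links are hypothetical, and the high-flyers ordering uses the plain difference X − Y rather than any scaled formula.

```python
import random
from datetime import date

# Hypothetical link-table entries for one keyword: URL number -> (X, Y, Z).
links = {
    101: (520, 400, date(1998, 2, 18)),
    102: (4000, 3900, date(1998, 2, 27)),
    103: (60, 10, date(1998, 2, 28)),
}


def popular(links):
    """URL numbers ranked by X, highest first."""
    return sorted(links, key=lambda u: links[u][0], reverse=True)


def hot_off_the_press(links):
    """URL numbers ranked by Z, newest first."""
    return sorted(links, key=lambda u: links[u][2], reverse=True)


def high_flyers(links):
    """URL numbers ranked by the difference X - Y, largest first."""
    return sorted(links, key=lambda u: links[u][0] - links[u][1], reverse=True)


def random_sample(links, k, rng):
    """A random sample of k URL numbers."""
    return rng.sample(sorted(links), k)


popular_list = popular(links)
new_list = hot_off_the_press(links)
rising_list = high_flyers(links)
sample = random_sample(links, 2, random.Random(0))
```

The same three stored numbers per keyword and URL thus serve every list; only the sort key changes.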

[0201] The personal links table 174 also allows past preferences to be listed as search results

[0202] Previous favorites search: a ranked hit-list based on the previous popularity for the individual (X from Table 6). This search is based only on the previous searching of the individual user. This allows users to very quickly find sites that they have previously visited.

[0203] A number of other search options are also available.

[0204] Conventional search: the list of search results from a normal search engine (116, FIG. 3)

[0205] Other content only search: a list of other content, such as advertisements, associated with the key-word.

[0206] These search results can be combined in a number of different ways

[0207] Collective search: a ranked hit-list that is a collection of any of the search hit-lists described above (this is the default set of search results)

[0208] Customized search: a ranked hit-list that can be a user defined combination of any of the above lists.

[0209] FIG. 5 also illustrates the use of keyword table 164 and security table 168 in a decision 246 to send out tagged web pages. This decision is based upon the frequency of key word usage, the data in the security table and the presence of a user identification. The details of the decision to send out tagged web pages are described fully in FIG. 16.

[0210] How the Different Types of Search Lists are Implemented

[0211] More details on how each of these types of searches is implemented are provided below, along with some of the advantages and disadvantages of each. The system relies on the brain power of the user, this time to determine what sort of search they want to do, which will depend on what they want to find. The search methods are described simply so users should intuitively know which one to use.

[0212] Popular search.

[0213] FIG. 6 illustrates the process for determining a list of popular web pages associated with the entry of a keyword 270 in step 272. If this search is selected and a keyword is entered, step 274 follows and produces a list of web pages based on the values of X taken from Table 3 (172, FIG. 5) for the keyword 270 entered. These web pages are identified by a unique web-page (URL) number from Table 3. Thereafter, in step 276 the list of web-page numbers found from step 274 is combined with the URL address and web-page description from Table 2 (188, FIG. 5). In step 278 the resulting list of web pages is then tagged, depending on the results of step 246 in FIG. 5 as described previously, and sent to the user for them to make their selections.

Hot off the press search.

[0214] FIG. 7 illustrates the process for determining a list of new web pages associated with the keyword entered in step 290. If this search is selected and a keyword is entered, step 294 follows and produces a list of web pages based on the values of Z taken from Table 3 (keyword URL link table 172 of FIG. 5) for the keyword entered in step 290. These web pages are identified by a unique web-page (URL) number from Table 3. Thereafter, in step 296 the list of web-page numbers found from step 294 is combined with the URL address and web-page description from Table 2 (URL table 188 of FIG. 5). In step 298 the resulting list of web pages is then tagged, depending on the results of step 246 in FIG. 5 as described previously, and sent to the user for them to make their selections.

[0215] The user will also be able to see exactly when each web-page was submitted so Internet users can be aware of its currency. An indirect consequence of this feature is the incentive for web designers to update their sites. The prominence given to new and updated sites provides a means of becoming established on the popular hit-list, encourages the use of appropriate key-words and rewards the upkeep of web pages that users find useful.

[0216] High-flyers search.

[0217] FIG. 8 illustrates a high-flying web pages search associated with the keyword entered in step 320. This is a list of the web pages that are increasing in popularity fastest. If this search is selected and a keyword is entered, step 324 follows and produces a list of web pages based on the relationship between the values X and Y taken from Table 3 (172, FIG. 5) for the keyword 320 entered. These web pages are identified by a unique web-page (URL) number from Table 3. Thereafter, in step 326 the list of web-page numbers found from step 324 is combined with the URL address and web-page description from Table 2 (188, FIG. 5). In step 328 the resulting list of web pages is then tagged depending on the results of step 246 in FIG. 5 and sent to the user for them to make their selection.

[0218] The high-flyer list is calculated by comparing the old popular ranking (Y) and the new popular ranking (X) from Table 3. From this the percentage increase in hits is calculated. An alternative method would be to rank the rate of change of popularity by the number of places the web pages rose compared to last time.

[0219] The formula for calculating the rate of change of popularity for this embodiment is given by:

((X − Y)/Y) · (X/(X_m β))

[0220] where X_m is the maximum value of X for the corresponding key-words and β is an additional variable that can be changed to alter the relative significance of changes at the top and bottom of the popularity list.

[0221] The reason for scaling by the maximum value of X is to ensure that small changes at the lower popularity levels do not swamp more significant changes higher up the table. For example, a web site having previously recorded only one selection and then attracting 5 hits the next day would exhibit a percentage increase of 500% whilst another web-page may have experienced an increase from 520 hits to 4000 hits (a much more significant increase) though this would otherwise appear as a lower percentage increase.
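By way of illustration only, the rate-of-change formula might be evaluated as in the following Python sketch. The sketch reads the printed formula literally, with β as a multiplier of the maximum value of X; the printed form could also be read with β as an exponent, and that reading is not shown. The function name `high_flyer_score` and the hit counts used are hypothetical.

```python
def high_flyer_score(x, y, x_max, beta=1.0):
    """Literal reading of ((X - Y) / Y) * (X / (X_m * beta)).

    x     -- new popularity value X
    y     -- old popularity value Y (must be positive)
    x_max -- maximum X recorded for the keyword (X_m)
    beta  -- tuning variable; treated here as a multiplier (assumption)
    """
    if y <= 0 or x_max <= 0 or beta <= 0:
        raise ValueError("y, x_max and beta must be positive")
    return ((x - y) / y) * (x / (x_max * beta))


# Hypothetical counts: a rarely visited page and a heavily visited page.
small = high_flyer_score(x=5, y=1, x_max=1000)       # large relative rise, few hits
large = high_flyer_score(x=1000, y=500, x_max=1000)  # smaller relative rise, many hits
```

With these hypothetical counts the heavily visited page receives the higher score even though its relative rise is smaller, which is the effect the scaling term is intended to produce.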

[0222] Random search.

[0223] This is a random selection of less-popular web-pages for users who want to look at web-pages off the beaten track, based upon a random selection of web pages that have any value of X, Y, or Z associated with a keyword that is entered. Accordingly, after a user enters a keyword in step 352 as indicated in FIG. 9, reference is made to the keyword URL link table 172 illustrated in FIG. 5, and a random list of web page numbers is generated automatically using a random number generator, as illustrated at step 354. Only web pages that have values for X, Y or Z associated with the keyword are chosen in this random selection, as this indicates that at some stage in the past a user or web page developer thought the web page had some connection to the keyword. Thereafter, in step 356 the list of web-page numbers found from step 354 is combined with the URL address and web-page description from Table 2 (188, FIG. 5). In step 358 the resulting list of web pages is then tagged, depending on the results of step 246 in FIG. 5 as described previously, and sent to the user for them to make their selections.
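By way of illustration only, the random selection of step 354 might be sketched in Python as follows. The table layout, the function name `random_search`, and the treatment of a zero as "no value" are assumptions made for the sketch.

```python
import random


def random_search(keyword, table3, count=5, rng=None):
    """Pick web-page numbers at random from those having any X, Y or Z
    value recorded against the keyword."""
    rng = rng or random.Random()
    candidates = [url_no
                  for url_no, (x, y, z) in table3.get(keyword, {}).items()
                  if x or y or z]  # at least one non-zero value
    return rng.sample(candidates, min(count, len(candidates)))


# Hypothetical data: keyword -> {URL number -> (X, Y, Z)}
table3 = {"marbles": {1: (3, 0, 0), 2: (0, 0, 0), 3: (0, 0, 7), 4: (1, 2, 0)}}
picks = random_search("marbles", table3, count=2, rng=random.Random(0))
```

Page 2 in the hypothetical data has no recorded value and is therefore never returned.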

[0224] Conventional search.

[0225] This is the normal search method of a conventional search engine, referenced as other search engine 116 in FIG. 3, which may or may not be included along with the searches according to the present invention, at the option of the user, as noted previously.

[0226] Content only search.

[0227] This is a list of content, such as advertisements, associated with the key-word, which the user cannot control. The ones that have paid the most will be at the top of the list, as described further hereinafter, in accordance with the preferred embodiment of the invention. Of course, other systems for identifying the order of paying content providers can also be implemented.

[0228] Previous favorites search.

[0229] FIG. 10 illustrates a previous favorites search that is based only on the previous searching of the individual user. This allows users to very quickly find sites that they have previously visited and therefore performs automatic book marking. It should be noted that since a password is preferably used to log on to the search engine system according to the present invention, the user will be able to access their personal preferences from any computer.

[0230] Thus, when the user types in a keyword at step 372 as indicated in FIG. 10, step 374 follows, during which the favorite sites for that keyword (based on previous usage) are determined from the personal link table 174 illustrated in FIG. 5. Because the user has a password that can be used to log on to the system, the user will thus be able to access their personal preferences from any computer.

[0231] Due to this search capability there is, therefore, no need to manually bookmark web pages. If a user forgot to book-mark a good site on, for example, ‘marbles’, they can easily find it by retyping the keyword that led them to that site. If a user's preferences change, the changes will be reflected in the personal links table 174.

[0232] Another embodiment of the personal preference search includes specifying the date the web page was last visited, with or without using a keyword. The web pages are then ranked based on Z in personal links table 174 of FIG. 5. For example, if a user looked at a site in the middle of last year, the user can refine the search by date, thus making it easier to find a previously useful web-page even if they could not remember the relevant keyword.

[0233] This automatic book-marking feature can also act as a device for monitoring the type of Internet use being undertaken by a particular computer and thus, for example, can provide warning to parents/employers of children/employees accessing undesirable sites, such as adult web-pages. In a preferred embodiment, for parents/employers unlikely to use the computer themselves, notification of such usage is automatically provided by letter to the parent/employer that lists the keywords selected and web pages visited by the children/employees. This information is found directly from each user table 174 of FIG. 5. This requires a user identification code that also includes parental/employer information.

[0234] Collective search

[0235] The collective search, as illustrated in FIG. 11, is the default search according to the present invention and is used when the user does not actively choose one of the other search options.

[0236] Upon entry of a keyword in step 402, that keyword is used to select from a combination of web page selections associated with that keyword. As shown, for example, in step 404, an equally weighted combination of conventional, popular, highflier, new and past search results is used to obtain a list of web page numbers. Thereafter, in step 406 the list of web-page numbers found from step 404 is combined with the URL address and web-page description from Table 2 (188, FIG. 5). In step 408 the resulting list of web pages is then tagged, depending on the results of step 246 in FIG. 5 as described previously, and sent to the user for them to make their selections.

When the system is first configured, the search engine 10 database will not possess any information on popular, high flyer and new web page hit-lists, so search results will initially be obtained from the conventional hit-list (normal search engine), and the tagged web pages then used to create the database sets as have been described. As the system develops, the data sets associated with each of the other search types will become populated, and searches using the other search types will become more useful.
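By way of illustration only, the equally weighted combination of step 404 might be sketched in Python as follows. The specification does not state how the individual hit-lists are merged; the reciprocal-rank scoring used here, the function name `collective_search`, and the sample hit-lists are assumptions made for the sketch.

```python
from collections import defaultdict


def collective_search(hit_lists, weights=None, limit=10):
    """Merge several ranked hit-lists into one list of web-page numbers.

    hit_lists -- mapping of search type to a ranked list of URL numbers
    weights   -- mapping of search type to weight; equal weights by default
    Each list contributes weight / rank to a page's score (assumed scheme).
    """
    weights = weights or {name: 1.0 for name in hit_lists}
    scores = defaultdict(float)
    for name, urls in hit_lists.items():
        weight = weights.get(name, 0.0)
        if weight <= 0:
            continue  # search type excluded from the mixture
        for rank, url_no in enumerate(urls, start=1):
            scores[url_no] += weight / rank
    # Highest score first; ties broken by URL number for determinism.
    return sorted(scores, key=lambda u: (-scores[u], u))[:limit]


# Hypothetical hit-lists; an empty list models a data set not yet populated.
hit_lists = {"conventional": [7, 3, 9], "popular": [3, 7],
             "highflier": [9, 3], "new": [], "past": [3]}
merged = collective_search(hit_lists)
popular_and_new = collective_search(hit_lists, weights={"popular": 1.0, "new": 1.0})
```

Passing unequal or partial weights, as in the second call, yields a mixture restricted to the chosen search types.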

[0237] Date created search.

[0238] FIG. 12 illustrates a date created search that allows the user to select the date that the web-page was submitted. This feature will only work for web-pages that contain a date created data entry, identified as date-time submission 74 in FIG. 4. Upon entry of a date-time and/or a keyword in step 432, the search engine 10 will perform step 434 in which a list of web page numbers associated with these variables is obtained. Thereafter, in step 436 the list of web-page numbers found from step 434 is combined with the URL address and web-page description from Table 2 (188, FIG. 5). In step 438 the resulting list of web pages is then tagged, depending on the results of step 246 in FIG. 5 as described previously, and sent to the user for them to make their selections.

[0239] Customized search

[0240] FIG. 13 illustrates a customized search that allows the user to decide how they want their default hit-list to appear. In step 462, the keyword and User ID are selected in order to initiate the customized search. Prior to performing the customized search in step 466, which is identical to step 404 of the collective search previously described with respect to FIG. 11, step 464 is applied to customize the user's default mixture of hit-lists. For example, a user may want their default search results to include only popular and new web pages but no high flying web pages. This custom search is then performed in step 466 to generate a list of web page numbers. Thereafter, in step 468 the list of web-page numbers found from step 466 is combined with the URL address and web-page description from Table 2 (188, FIG. 5). In step 470 the resulting list of web pages is then tagged, depending on the results of step 246 in FIG. 5 as described previously, and sent to the user for them to make their selections.

In one preferred embodiment, the make-up of the default search results list can be amended by ‘learning’ from the user's behavior to create a changing customized search based on the user's own search patterns. If a user consistently chooses new web pages or high-flying web pages, for example, then their set of default search results will be changed to reflect their normal search style.

[0241] Magazine search.

[0242] The magazine search according to the present invention enables users to search by following a series of menu-driven subject choices (or similar hierarchical structure), rather than entering a specific key-word(s).

[0243] Existing magazine-style search engines require editors to set the structure of information, decide on its relevant merits and set the criteria, such as price, for space on a given page transmitted to the user/viewer. Using the search system of the present invention, the users themselves dynamically decide what is and is not worth seeing. Thus, although editorial input is needed regarding a hierarchy of subjects, the web-pages that emerge as the most popular for each of these subjects will evolve automatically.

[0244] Use of Data Sets for Different Groups of People

[0245] Different popular hit-lists may be employed to provide results which would reflect different cultural, geographical, professional, gender or age interests. Thus, as shown in FIG. 14, when a user enters a keyword and User ID in step 490, the default profile of the user can be used to reflect the type of web pages that people of the same “group” as the user's profile desire to see. Thus, the search that takes place in step 494 is based on the subscripted X, Y and Z values obtained from the default profile of people of those “group” affiliations identified in the user's personal profile obtained in step 492. Thus, rather than an overall global search result, search results are obtained that are particularized for the group that the user identifies with. The resulting list of web pages, derived from steps 496 and 498, as have been previously described, is particularized for that group.

[0246] Thus, for a particular user with the profile type New Zealand selected as a geographical factor, in a search for team field sports and related key-words, rugby material might figure prominently, whereas an American profile type may produce a bias towards baseball/American football material, for example. This technique offers the ability to discriminate between the different meanings of the same words, according to the context of the popular hit-list associated with a particular profile type. A general search using a key-word ‘accommodation’, for example, would include results related to housing, renting and similar, whereas if the user indicated an interest in optometry in their profile type, then the term ‘accommodation’ would be interpreted quite differently.

[0247] The relevance of such sites will evolve automatically, without any active evaluation of the sites by the search engine operator or the user. There are no complex algorithms required to analyze the relevance of web-sites for particular types of users. Instead, the type of site deemed relevant will be decided by those users selecting those characteristics for their profile type, e.g. American females interested in rock-climbing. Sites of greater relevance will naturally attract more hits, increasing their ranking and thus increasing the chance of a subsequent user also investigating the site. In the above example, any web sites listed for the keyword ‘accommodation’ which were unrelated to optometry, sight, lens, vision, etc., would not be accessed for the period of time required to make a valid hit. They would therefore receive a very low ranking and hence be even less likely to be accessed by further users.

[0248] The user can select different profile types for different searches during a single session and is not restricted to the default profile types.

[0249] In a further embodiment of the invention, there can be included a level of authentication for persons of a certain group to have their search results actually be used for the purpose of updating the database relating to that group. For example, doctors who have a user ID that identifies them as doctors may perform a search related to a certain medical condition, and their selections can be tagged and used in the database for that group of doctors as has been previously described. However, although patients may desire to identify their profile with that of the same group of doctors, their selections are not as significant as those of the actual doctors, and thus while they are able to view the web page listings that doctors deem most pertinent, their selections are not used to update the doctors' group database, since their IDs do not identify them as doctors.

[0250] Limiting Search Options

[0251] Another feature of the present invention is a keyword eliminator feature, which is illustrated in FIG. 15, and prevents certain users, such as children, from searching for undesirable keywords and web-pages when the keyword eliminator feature is turned on. The present inventors have realized that it is potentially much easier, for example, to stop children searching for pornography than to attempt to trace and prevent access to all sites on the Internet with pornographic content. This would be used as a complementary tool to existing “net nanny” type devices. Thus, as shown in FIG. 15, with the keyword eliminator turned on, a preexisting list of inaccessible keywords is stored in a table and compared in step 522 with a keyword previously entered, as shown by step 520. Thus, keywords that are inaccessible will not be searched. For example, parents could choose the types of keywords 552 that they do not want their children to search for, and this will be different for different sets of parents. The system filters out the keywords that may be used for subsequent searching in step 524.
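By way of illustration only, the comparison of step 522 and the filtering of step 524 might be sketched in Python as follows. The function name `filter_keywords`, the case-insensitive matching, and the sample keywords are assumptions made for the sketch.

```python
def filter_keywords(entered, blocked, eliminator_on=True):
    """Return the entered keywords that may be used for searching.

    entered       -- keywords typed by the user (step 520)
    blocked       -- table of inaccessible keywords, e.g. chosen by a parent
    eliminator_on -- when False, every entered keyword is passed through
    """
    if not eliminator_on:
        return list(entered)
    # Assumed normalization: ignore case and surrounding whitespace.
    blocked_normalized = {word.strip().lower() for word in blocked}
    return [word for word in entered
            if word.strip().lower() not in blocked_normalized]


allowed = filter_keywords(["Marbles", "casino"], blocked={"Casino"})
unfiltered = filter_keywords(["Marbles", "casino"], blocked={"Casino"},
                             eliminator_on=False)
```

Because only the entered keyword is examined, no list of blocked web sites needs to be maintained for this check.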

[0252] Determining which Users to Sample

[0253] FIG. 16 illustrates the process of determining which search results should be sampled and used to make up the cumulative surfer trace table 170 of FIG. 4, also referred to as Table 4. While possible, it is not necessary to collect data concerning every single search, and this can be controlled by determining which sets of results get sent out with “tagged” web pages. This was already mentioned with respect to authenticating users of a particular group, doctors in the example provided.

[0254] As shown in FIG. 16, after entry of keywords and other data in step 554, there are three decisions that determine whether results are actually “tagged” as has been previously described in step 118 of FIG. 3.

[0255] As shown by step 556, for a user that has a user ID and has chosen to use the personal links table 174 of FIG. 5 (Table 6) as previously described, it is necessary to “tag” all of their results so that all of their past preferences are recorded in their personal links table 174. The search engine system according to the present invention can update the user's personal preferences but not update Table 3 if certain security levels have not been satisfied (see below). If, however, the personal link table 174 is stored on an individual's computer rather than at a central location, there is no need to send out tagged results as the data is stored locally.

[0256] As shown by step 558, when a keyword is submitted, a check is made using security table 168 (Table 7) that the IP address 62 has not already searched the keyword before the user is sent a set of tagged results. If it has, the user can still undertake the search though it will not contribute to the cumulative surfer trace 170 (Table 4). This allows all normal users to affect the popular hit-list and all users to search whatever they would like, but prevents fraudulent users, such as spammers, from contributing to the popular hit-list. The security table 168 can also include information on links between keywords 52 and a user ID 56 to detect repeat searching.
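By way of illustration only, the repeat-search check of step 558 might be sketched in Python as follows. The class name `SecurityTable`, its in-memory set, and the sample addresses are assumptions made for the sketch; the specification does not describe how security table 168 is stored.

```python
class SecurityTable:
    """Remembers which (IP address, keyword) pairs have already searched."""

    def __init__(self):
        self._seen = set()

    def should_tag(self, ip_address, keyword):
        """Return True only for the first search of a keyword from an
        IP address; repeat searches still run but are not tagged."""
        key = (ip_address, keyword.strip().lower())
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


security = SecurityTable()
first = security.should_tag("192.0.2.1", "marbles")      # first search: tagged
repeat = security.should_tag("192.0.2.1", "Marbles")     # repeat: not tagged
other_ip = security.should_tag("192.0.2.2", "marbles")   # different address: tagged
```

A further key on user ID could be added in the same way to detect repeat searching by a logged-on user.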

[0257] While it is possible for users to change the IP address of their computer, this is also detectable and preventable by a number of methods such as registering and tracking the use of IP numbers.

[0258] Methods to exclude false searches include:

[0259] Only creating a surfer trace for users with a user ID 554 recorded with the search engine.

[0260] Extending the time limit required to make a visit count as a useful hit.

[0261] Not counting single visits to a URL from a keyword (for which there is no means of measuring a lapsed time).

[0262] As shown by step 560, popular keywords can be traced once every tenth, hundredth, or even thousandth occurrence, and the frequency of this selection can be changed to optimize the system. The frequency of keyword usage is determined from keyword table 164 as shown in FIG. 5 (Table 1). The frequency of sending out tagged results can also be linked to the rate at which popularity is changing for different keywords. For example, the keyword “IBM” would probably have IBM's home page at the top and most users would go there, whereas the keyword “latest fads” may have a constantly changing set of web pages that needs to be sampled more frequently.
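By way of illustration only, the sampling decision of step 560 might be sketched in Python as follows. The class name `KeywordSampler`, the per-keyword interval table, and the interval chosen for the sample keyword are assumptions made for the sketch.

```python
from collections import Counter


class KeywordSampler:
    """Decides whether a search should be sent out with tagged results."""

    def __init__(self, default_interval=1):
        self.counts = Counter()      # usage count per keyword
        self.intervals = {}          # keyword -> sample every Nth occurrence
        self.default_interval = default_interval

    def should_tag(self, keyword):
        """Count the occurrence and tag only every Nth one."""
        self.counts[keyword] += 1
        interval = self.intervals.get(keyword, self.default_interval)
        return self.counts[keyword] % interval == 0


sampler = KeywordSampler()
sampler.intervals["ibm"] = 10        # stable keyword: sample every tenth search
tagged = [sampler.should_tag("ibm") for _ in range(30)]
fads = sampler.should_tag("latest fads")  # volatile keyword: sampled every time
```

Adjusting an entry in `intervals` corresponds to changing the sampling frequency for a keyword whose popular list is changing quickly or slowly.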

[0263] To prevent the keyword URL link table 172 of FIG. 5 (Table 3) from becoming unduly large, one method is to only register keywords in Table 3 once they reach a certain frequency of usage. This is controlled by not sending out tagged results for less frequently used keywords (found from Table 1).

[0264] Active Suggestion of Web Pages to Visit

[0265] Another feature of the present invention is illustrated by FIG. 17, and involves using data to actively suggest web pages. This is different from a search because the user sets up the request and is informed if there is any new data on the subject. To do this the user has to actively specify which keywords they are interested in, the profile type that they would like to act as a filter or agent, and the search type (new, highflying, popular) in step 588. This information is stored in the user's profile ID 166 shown in FIG. 5 (Table 5).

[0266] Thus, at various intervals the user receives a list of suggested web pages determined by a group of like-minded humans. For example, a user may choose to be notified of web pages with the following:

Keyword 582 | Profile type (agent 588) | Search type 586
Rugby | New Zealand, Male | highflying
Decay treatments | Dentist | new

[0267] This way, if there are highflying web pages on “rugby” that other New Zealand males found useful (i.e. they spent a significant amount of time looking at the information, giving a high rate of change of X in Table 3), the user would be notified. Similarly, if there was any new information on “decay treatments” submitted for dentists to look at, the user would be notified about it (value of Z in Table 3). It is unlikely that a computer agent will ever be as good at filtering information as a selected group of peers. An advantage of this system compared to other “agent type” software is that this does not require any software on the user's computer. It is all included as a natural extension to the other search engine data sets.

[0268] The suggested web-sites can be displayed for the user when they next access the search engine or they may choose to be notified of these suggested web pages via e-mail notification. This way web pages can be drawn to the user's attention without any active searching for these keywords.

[0269] Passive Suggestion of Web Pages to Visit

[0270] Another feature of the present invention is illustrated by FIG. 18, and involves automatic web-page suggestion based on how the user has searched in the past and requires no active input from the user.

[0271] As shown in step 620, upon the entry of a user ID, the system can be activated passively, at various intervals or times (such as at each login to the search engine), by looking at which keywords, profile types and search types the user frequently looks at using the personal links table 174 of FIG. 5 (Table 6). For example, it may be that the user frequently looks at Rugby information as a “New Zealand, male” and looks at decay treatments as a “dentist”. This information can be found from the automatic book marking table, previously referred to as personal links table 174. If the user has not looked at these subjects for a certain length of time and there are new or highflying information sources, the user will be automatically notified of these new information sources.

[0272] In a modification of this embodiment, a periodic e-mail can be sent out with the two newest and highest flying sites related to the key-words of the user.

[0273] Determining a List of Suggested Keywords

[0274] A problem with Internet searching for many users is knowing which key-word to use for searching. While the present invention could be implemented with an infinite number of keywords, too many key-words (including phrases) that users choose can be problematic.

[0275] Accordingly, as shown in FIG. 19, the present invention also provides for a data set 642 that provides synonyms for the keywords entered along with the particular profile type in step 640. The system represented in FIG. 19 is referred to as a key word suggester. This is implemented, in one embodiment, by matching the key-word entered by the user in step 640 with the existing key-words and phrases in keyword table 164 of FIG. 5 (Table 1) that other users have tried using other search methods, identified in step 646. Each keyword is then tagged in step 660, and those that are selected by a user in step 662 are used to form a keyword surfer trace 648 as shown in FIG. 19, which contains the original keyword 52 that the user entered, the keyword selected 652, and the IP address 130, user ID 128 and date-time 132 data as in the previously described web page surfer trace.

[0276] The data from the cumulative keyword surfer trace 648 is then used to reinforce links between keywords. In this way the system learns which keywords are associated with each other. The system learns which words are related to each other in the same way that the system learns which URLs are associated with the key-words. The lists of suggested keywords will become more relevant over time as the relevancy is improved each time the keyword suggester is used.

[0277] Creating Data Sets that Determine the Suggested Keywords

[0278] As shown in FIG. 20, a keyword link table 696 and a cumulative keyword trace table 698 are used along with the previously described security table 168 to create the data sets for suggested keywords. The key-word link table 696, shown in Table 10 below, records how often each key-word is selected from the suggested key-word list. This can then be used to rank the usefulness of different key-words relative to each other.

TABLE 10 Keyword link table
Key-word 1 Key-word 2 Key-word 3 Key-word 4 Key-word 5
Key-word 1 - 5
Key-word 2 20 - 1134
Key-word 3 356 -
Key-word 4 -
Key-word 5 20 -
Key-word 6 3
Key-word 7 168

[0279] It can be seen from Table 10 that people who entered key-word 2 found key-word 3 the most useful, followed by keyword 5, then key-word 1. The keywords can have a directional aspect; for example, keyword 3 was found useful 1134 times after trying keyword 2. However, keyword 2 was found useful only 356 times after users tried key-word 3.
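By way of illustration only, ranking suggested keywords from a directional keyword link table might be sketched in Python as follows. The nested-mapping layout and the function name `suggest_keywords` are assumptions; the counts 1134, 356 and 20 are those given for Table 10, and the placement of the count 20 against key-word 1 is an assumed reading of the flattened table.

```python
def suggest_keywords(entered, link_table, limit=5):
    """Return keywords selected after `entered`, most frequently chosen first.

    link_table maps an entered keyword to {selected keyword: hit count},
    so the link from A to B is stored separately from the link from B to A.
    """
    row = link_table.get(entered, {})
    return sorted(row, key=lambda selected: row[selected], reverse=True)[:limit]


link_table = {
    "key-word 2": {"key-word 3": 1134, "key-word 1": 20},
    "key-word 3": {"key-word 2": 356},
}
after_2 = suggest_keywords("key-word 2", link_table)
after_3 = suggest_keywords("key-word 3", link_table)
```

Storing each direction under its own row is what allows the two counts for the same pair of keywords to differ.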

[0280] Information about the links between keywords in Table 10 is updated by the information about how people are using suggested keywords (keyword surfer traces 648). The cumulative keyword surfer trace 698 is the combined information from all individual keyword surfer traces 648 and it is used to determine how many “hits” (significant visits) each keyword had for each key-word.

[0281] The information collected from each individual surfer trace is a series of inputs that become a cumulative keyword surfer trace, shown in table form below in Table 11.

TABLE 11 Keyword cumulative surfer trace
IP Number | User ID | Keyword (original) | Keyword (suggested) | Date-time

[0282] Populating the Keyword Link Table

[0283] FIG. 20 also illustrates how links between keywords in Table 10 can be initiated by recording sequences of keywords that users put into the search engine. If, for example, someone searches using the keyword “NHL” and then “National Hockey League”, this would then draw an association between these two key-words in Table 10 by recording this as one hit. Again this captures the reasoning power of users to define the link between two keywords. Often the next keyword in sequence will be totally unrelated to the previous key-word but sometimes it will be relevant. If the next user chooses it from the key word selector it will reinforce the key-word link in the same way that repeat selection of web pages reinforces links between a keyword and a URL.
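By way of illustration only, initiating keyword links from the sequence of keywords entered by a user might be sketched in Python as follows. The function name `record_sequence` and the nested `defaultdict` layout are assumptions made for the sketch.

```python
from collections import defaultdict


def record_sequence(link_table, queries):
    """Record one hit for each pair of consecutive, different keywords
    entered by the same user in the order they were entered."""
    for previous, current in zip(queries, queries[1:]):
        if previous != current:
            link_table[previous][current] += 1


# entered keyword -> {next keyword -> hit count}
link_table = defaultdict(lambda: defaultdict(int))
record_sequence(link_table, ["NHL", "National Hockey League"])
record_sequence(link_table, ["NHL", "National Hockey League", "book"])
```

An unrelated follow-on keyword, such as "book" in the second call, receives a single hit that is only reinforced if later users choose the same link.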

[0284] The following is an example of keywords that may be suggested after entering a simple key-word like “Book”:

[0285] book sales

[0286] book reviews

[0287] specialist books

[0288] second hand books

[0289] used books

[0290] special edition books

[0291] All of these key-words (phrases) would come from information seekers (users) and information providers (web-page developers). The most appropriate keywords will emerge naturally over time.

[0292] All keywords used by users are entered into the key-word link table 696 of FIG. 20. Thus, if people enter an uncommon keyword such as “cassettes” instead of “cassettes” the key-word suggester will suggest that the user tries “cassettes”. There is, therefore, no need to create a set of URL-keyword links in Table 3 for “cassettes”, thus saving on data space, and there is also no need to send a tagged set of results for the keyword “cassettes”. Hence there will be less data sent back to the search engine.

[0293] It is also a contemplated embodiment to run the keyword suggester like Table 3 and have high flying keyword associations and new keyword associations so the system can learn how keyword associations change over time. For example, the keyword suggester trace can store the most recent keyword links and modify the main key-word trace by a history factor, in the same way as Table 3 is modified by the cumulative surfer trace.

[0294] The cumulative keyword surfer trace 698 is processed in the same way as the cumulative web-page surfer trace 170 of FIG. 5 to reinforce links between keywords in the keyword link table 696 (Table 10). A time variable can also be included so that if a user chooses another keyword very quickly it is assumed that the previous keyword was not useful and is not counted as a keyword surfer trace.

[0295] Also, the individual keyword suggester can store, for each user, their personal keyword links. Further, the keyword suggester can be based on a number of different profile types. The word associations may be quite different for people of different culture, nationality, occupation and age etc. Different keyword suggesters can capture the key-word association of different groups of people. The keyword hits in Table 10 can be subscripted in the same way that the values of X, Y and Z are subscripted for different types of profiles in Table 3, as explained previously.

[0296] Using the Tables to Create a List of Suggested Keywords

[0297] FIG. 21 illustrates a variety of manners in which a list of suggested keywords can be created.

[0298] One manner is by ranking the values of X in the keyword link table 696 (Table 10). This ranked list of keywords is combined with keywords from a normal search of keywords, described previously with respect to step 646 of FIG. 19.

[0299] Another manner of suggesting keywords, shown as step 730, is to compare the popular list (URL X values) for the user-entered key-word with the popular list of other key-words in Table 3. A similar pattern of X values in Table 3 indicates that these keywords are similar. For example, a user may search for “film reviews” and the keyword suggester may come up with “movie reviews”, which has a more comprehensively searched list of sites. In this case there is no physical similarity between the words movie and film, but they are linked by the similarity of the patterns of URL links they have in common in Table 3.
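By way of illustration only, the comparison of popular lists in step 730 might be sketched in Python as follows. The specification does not name a similarity measure; cosine similarity over the X values is used here as an assumed stand-in, and the function name `x_pattern_similarity` and the sample X values are hypothetical.

```python
import math


def x_pattern_similarity(x_a, x_b):
    """Cosine similarity between two keywords' {URL number: X} patterns.

    Returns a value between 0.0 (no URLs in common) and 1.0 (identical
    proportions of hits across the same URLs).
    """
    shared = set(x_a) & set(x_b)
    dot = sum(x_a[u] * x_b[u] for u in shared)
    norm_a = math.sqrt(sum(v * v for v in x_a.values()))
    norm_b = math.sqrt(sum(v * v for v in x_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical Table 3 rows: keyword -> {URL number -> X}
table3 = {"film reviews": {1: 10, 2: 5},
          "movie reviews": {1: 100, 2: 50, 3: 5},
          "marbles": {9: 40}}
sim_movie = x_pattern_similarity(table3["film reviews"], table3["movie reviews"])
sim_marbles = x_pattern_similarity(table3["film reviews"], table3["marbles"])
```

The measure depends only on which URLs the keywords share and in what proportions, so the two keywords need have no letters in common.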

[0300] The usefulness of the key word suggester list is enhanced, as indicated by step 744, by associating with each key-word on the suggestion list an indication of whether there are any of the aforementioned searches available (popular, high flyer, etc.) for that key-word in keyword URL links table 172 of FIG. 5 (Table 3). The keywords with the most search results are then highlighted.

[0301] Decision to Send Out Tagged Keyword Suggestions List.

[0302] The security table 168 and keyword link table 696 are used to determine which keyword links to sample in a manner similar to that previously described with respect to tagging web pages. As with the decision for tagging web pages, this can depend on whether it is a repeat keyword (found from security table 168) and on the frequency of keyword usage (found from keyword table 164), as well as the considerations previously discussed.

[0303] Determining Other Content

[0304] When searching on the Internet, various different web page listings and web pages are displayed as has been described. One common characteristic of each of these different web page listings that have been described is that when they are displayed they appear substantially identical to one another. As shown in FIG. 25, each of the different listings 900, though the text may be different, is otherwise visually identical. Other listings 902, however, are many times larger than the listings 900, may include graphical content, and appear more prominent when displayed to the user. Such listings can contain the same content as a web page listing, or other content, such as advertisements, pictures, editorials and the like.

[0305] This other content may be displayed to a particular user based upon key-words, user profile type (nationality, age, gender, occupation, and so forth) and the time of the day, for example.

[0306] In many instances, this content that is displayed along with web page listings is inserted into the display area using mechanisms that are different from the searching system described previously with respect to conventional search engines. The mechanism by which this content is displayed is in large measure based upon some other criteria, such as payment for the space that is used. While the system for selecting this content works, it is difficult to keep track of which content was displayed when, especially if that content is frequently changed. Thus, another aspect of the present invention, which will now be discussed, is a system for tracking changing content, and allowing for content providers to dynamically select when their content will be displayed.

[0307] This dynamically selectable content, as illustrated in FIG. 22, may be displayed to the viewer based upon the keyword or profile type entered by the viewer in step 762 as shown. Within the content selector step 764 that then follows, the time of the day is considered and used in selecting the appropriate content 902, as illustrated in FIG. 25, along with the web page listings 900. Each content 902 transmitted with the search results made up of web page listings 900 is tagged in step 766. Thus, if a user in step 768 selects that content 902, the result of that selection is fed back to the content selector 764 so that the content database associated therewith can be updated as surfer trace data in a manner such as has been previously described. Thereafter, in step 770, that content 902 is displayed, typically simultaneously with the web page listings 900.

[0308] In addition to the surfer trace data being input as has been previously described, this content embodiment also allows the web page developer, or content provider, to determine the frequency with which this content will be reviewed and, depending upon the patterns of users with respect to web page listings that are viewed, to alter the manner in which the content provider's content 902 is displayed based upon key words, user profile and the like. In order to implement this dynamic content flexibility, there are three additional data tables, illustrated in FIG. 23, which are used to track the changing content 902. These tables are keyword content data table 804, personal profile content data table 806, and content provider data table 812.

[0309] Keyword content data table 804 is illustrated in more detail in Table 12 below, and its characteristics are:

[0310] H is the cumulative number of hits for one time period for the keyword. This is the number of times people choose that keyword;

[0311] N is the number of times particular content 902 that is associated with a keyword has been sent out for display. This is not necessarily the same as H, since content associated with a profile type may have a different selection factor than content associated with the keyword. This selection factor can be any of various variables, such as votes or price;

[0312] A is the selection factor for the keyword from each content provider (e.g. a selection factor could be a $ bid to be associated with that keyword);

[0313] T is the total of the selection factors for each keyword and is the sum of the A's; and

[0314] P is the content value, as determined by votes or price, for each keyword and is T/N (e.g. this could be the $ per time content is sent out with that keyword; this is a price of being associated with that keyword).

TABLE 12 Keyword content data sets

Keyword | Cumulative hits for one month (H) | Amount of content sent out (N) | Content Provider 1 (A1) | Content Provider 2 (A2) | Total (T) | (P)
Books   |                                   |                                |                         |                         |           |
Fish    |                                   |                                |                         |                         |           |

[0315] This Table can also include the maximum content value M that the content provider is prepared to give. There is no limit to the number of content providers that may attempt to have content 902 displayed with a web page listing that is associated with a particular keyword.
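By way of illustration only, the relationships among H, N, A, T and P set out above could be sketched in Python as follows. The class name, the field names and the treatment of the case N = 0 are assumptions made for this sketch and do not appear in the description; the figures are illustrative and are not taken from the Tables.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class KeywordContentRow:
    """One row of a keyword content data set (hypothetical layout)."""
    keyword: str
    hits: int = 0   # H: cumulative hits for the time period
    sent: int = 0   # N: times content for this keyword was sent out
    # A: selection factor entered by each content provider
    selection_factors: Dict[str, float] = field(default_factory=dict)

    @property
    def total(self) -> float:
        """T: the sum of the selection factors A."""
        return sum(self.selection_factors.values())

    @property
    def value(self) -> float:
        """P = T / N; 0.0 is returned when nothing has been sent (assumption)."""
        if self.sent == 0:
            return 0.0
        return self.total / self.sent


# Illustrative figures only.
row = KeywordContentRow(
    "fish", hits=50, sent=100,
    selection_factors={"provider_1": 5.0, "provider_2": 5.0},
)
```

Computing T and P on demand, rather than storing them, keeps the row consistent whenever a content provider changes a selection factor.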

[0316] It is possible to have a separate Table 12 for each country or area, so that the content value per keyword could be different for each country or area. In addition, there could be different content values for different time periods in each country or area.

[0317] It is possible that providers of content 902 could target both the key-word and the audience by identifying each of the keywords with target audiences, e.g. the number of hits associated with the word rugby could be broken down into the different profile types that search for the word rugby. The cumulative number of searches for rugby could be 6000, split into 520 under 21's, 4000 21-50 year olds and 520 in the 50+ age group. Thus, there may be a different content value for each of these sub classes within a keyword search.

[0318] In addition to the key-word data set 804, it is possible to have a data set of the following type for different profile types 806. It contains the same entries as described above with respect to the keyword content data table 804 of FIG. 23, but for each profile type instead of each keyword.

TABLE 13 Personal profile content Table

Profile type      | Cumulative hits for one month (H) | Amount of content sent out (N) | Content Provider 1 (A1) | Content Provider 2 (A2) | Total (T) | (P)
Male              |                                   |                                |                         |                         |           |
Female            |                                   |                                |                         |                         |           |
Professional etc  |                                   |                                |                         |                         |           |
Undefined profile |                                   |                                |                         |                         |           |

[0319] Table 13 determines the content value of the content 902 to specific audiences of people, as opposed to different keywords, and allows for targeting of specific audiences.

[0320] It is within the scope of the present invention to include combination profile types in Table 13 as well, such as "male, professional" or "New Zealand, female". The content value for the combined profiles will be different from the content value of individual profiles. The mechanics involved in determining the content value and choosing the content 902 will be the same, and are described further hereinafter.

[0321] Content provider data table 812 of FIG. 23 is illustrated in more detail below as Table 14 and contains information about the content provider, such as name, address and advertiser, and content information, such as the bitmap (HTML, Java applet or similar) that the content 902 will use, together with a unique number to identify each different item of content 902.

TABLE 14

Name      | Address etc | Content Information | Unique number for each Content
E.g. John |             | Content. no.        | Content. no.

[0322] This Table may also store details of the content provider, such as passwords, payment details (e.g. credit card number and authorization), content delivery (number of times content has been sent to users) etc.

[0323] The data sets for the above mentioned content tables are populated as follows. For the keyword content data table 804:

[0324] H, the cumulative number of hits for a particular key word for one time period, is taken directly from Table 1 (800).

[0325] N is the number of times content is sent out associated with the keyword. This is incremented each time an item of content 902 that is specifically associated with that keyword 810 is displayed to a user.

[0326] The values for A 802 are selected by content providers for each keyword. The content provider can also enter a maximum value M, above which they will no longer elect to be sent out with the keyword.

[0327] T is the total for each keyword and is the sum of the A's.

[0328] P is the content value, as determined by votes or price, for each keyword and is T/N.

[0329] Populating the Personal Profile Content Data

[0330] H is the cumulative number of hits for each profile type and this information is taken directly from Table 1 (sum of the indexed W's).

[0331] N is the number of items of content 902 sent out associated with the personal profile. This is incremented each time an item of content 902 is sent out that is specifically associated with that profile type 810.

[0332] The values for A 808 are placed, through an entry process akin to bidding, for each profile type. The content provider can also enter a maximum M they are prepared to pay, or vote, as the case may be.

[0333] T is the total for each profile type, and is the sum of the A's.

[0334] P is the content value for each profile and is T/N.

[0335] Populating the Content Provider's Details Table

[0336] The majority of the content provider's details 812 are electronically entered by the content providers. Each time a content provider's content 902 is sent out, this event is also recorded in the content provider's details Table 812. The Table will also record the number of click-throughs (820, 822, 824, 826, 828) and the cost, in terms of payment or votes, of the content 902. This will form the basis of the electronic bill or tabulation that is thereafter forwarded to the content provider.

[0337] How the Data Sets are Used to Select Content Sent Out to Users

[0338] In the discussion that follows, with reference to FIG. 24, it is assumed that only one banner of content 902 is transmitted with each set of web page search results 900. The same algorithms apply if there are multiple sets of content transmitted with each set of web page results.

[0339] A keyword and profile type are submitted to the search engine in step 852. From keyword content data table 804 and personal profile content data table 806, the value of content 902 for each is found from the value of P in the Tables. The highest value of P for the keyword or profile type, determined in step 862, determines the type of content (keyword or profile type) that is transmitted along with the web page listings 900. It may be that there is no specific value for the keyword and that the user is not using a specific profile type. In this case the values for unassigned content items will be used (from Table 13 for users without a profile). Choosing which specific content item 902 is sent out is discussed below. The details for the content item (its graphics, text, associated programs, etc.) are taken from Table 14, content provider details table 812, and transmitted to the user in step 868. Details of the content items 902 transmitted for each content provider are also sent to the content provider at regular intervals, as shown by step 870.

[0340] Determining whether it is Keyword or Profile Content that is Transmitted

[0341] The type of content 902 transmitted depends upon whether it is key word based content or profile option based content. For example, a male from the US may search for fish. The values applicable to this search are those for keyword=fish, profile=male, profile=US, and the combined profile=US, male. When deciding which content gets displayed, the system compares the value of the content for all the possibilities (keyword, combinations of profile types) and sends out the content that has the most value, as determined in step 862. For example, an under 21 male may search using the key-word “Rugby”. If the value for the content associated with Rugby is 0.1 per view, whereas the value per view for targeting an under 21 male is 0.2, the content targeted at the male under 21 would be displayed rather than the rugby content. It is important to note that the cumulative frequency of times that content items 902 are transmitted (N) will be different to the total cumulative frequency for the targeted area (H). In this example, the cumulative frequencies (H) of the number of times ‘rugby’ is searched for and of ‘males under 21’ would both be incremented by one (via Table 1). However, the number of times an item of content 902 is displayed would be incremented only for the ‘male under 21’ Table (this is the figure used to determine the value of the content per unit view).
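The comparison described above, together with the different treatment of H and N, might be sketched in Python as follows. The function names, the string labels used for the targets and the rule that a tie goes to the first target listed are assumptions made for this sketch only.

```python
from typing import Dict, Iterable


def choose_target(values: Dict[str, float]) -> str:
    """Return the keyword or profile target with the highest content value P.

    Ties go to the first target listed (an assumption of this sketch).
    """
    if not values:
        raise ValueError("no candidate targets")
    return max(values, key=values.get)


def record_search(hits: Dict[str, int], sent: Dict[str, int],
                  searched: Iterable[str], chosen: str) -> None:
    """Increment H for every target matched by the search, N only for the chosen one."""
    for target in searched:
        hits[target] = hits.get(target, 0) + 1
    sent[chosen] = sent.get(chosen, 0) + 1


# Values per view from the rugby example in the text.
values = {"keyword:rugby": 0.1, "profile:male,under-21": 0.2}
chosen = choose_target(values)

hits: Dict[str, int] = {}
sent: Dict[str, int] = {}
record_search(hits, sent, values.keys(), chosen)
```

After this single search, both targets have one hit recorded, but only the under 21 male profile has an item of content counted as sent.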

[0342] Determining which Specific Content is Transmitted

[0343] The example below shows how content associated with the keyword is selected. It is the same process for content associated with profile types.

Keyword | Cumulative hits for one month (H) | Number of content items sent out (N) | Content Provider 1 (A1) | Content Provider 2 (A2) | Total (T) | (P)
Book    | 134                               | 134                                  |                         | 10                      | 10        | 0.050
Fish    | 52                                | 80                                   | 5                       | 5                       | 10        | 0.52

[0344] For the key-word “book” the content 902 of content provider 2 would be displayed whenever the keyword was searched, as they are the only content provider associated with that key-word. However, for the key-word “fish”, content providers 1 and 2 would have their content sent out the same number of times. In the system scaled to the levels at which it is intended to be used, there will be a very large number of content providers bidding for different keywords and profile types.

[0345] Calculating the Value of Content

[0346] If there is a new content provider who, for the keyword “book,” values the content at, for instance, $5 per month, this will change the value to 0.075 and will mean that the total associated with the word book is $15. Therefore, content provider 2 would now get transmitted 66% of the time (10/15) and the new content provider would be displayed 33% of the time (5/15). The proportion of time a content provider's content is transmitted is A/T.
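The proportion A/T could be realized in several ways. The following Python sketch computes the shares and then picks a content provider by weighted random choice; the use of random selection (rather than, say, a fixed rotation) and all names are assumptions of this sketch, since the description specifies only the proportion.

```python
import random
from typing import Dict


def transmission_shares(factors: Dict[str, float]) -> Dict[str, float]:
    """Return each provider's share A/T of the transmissions for one keyword."""
    total = sum(factors.values())  # T
    if total <= 0:
        return {}
    return {provider: a / total for provider, a in factors.items()}


def pick_provider(factors: Dict[str, float], rng=random) -> str:
    """Pick one provider with probability A/T (assumed selection method)."""
    providers = list(factors)
    weights = [factors[p] for p in providers]
    # random.choices raises ValueError if all weights are zero.
    return rng.choices(providers, weights=weights, k=1)[0]


# Figures from the "book" example: provider 2 bids 10, the new provider bids 5.
shares = transmission_shares({"provider_2": 10.0, "new_provider": 5.0})
```

Over many searches the observed frequencies approach 10/15 and 5/15; a deterministic rotation would reach the same proportions with less variation.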

[0347] How Content Providers Use the Data Tables

[0348] When bidding for content 902, content providers select a keyword or profile at which to target their content from Tables 12 & 13. The search engine automatically indicates the number of times this search has been performed for the previous time period (H), the number of times items of content were sent out associated with that selection (N) and the value of the content (P).

[0349] The new content provider then enters the selection factor A and the system can then instantly calculate the new value (P) based on the new total bids (T). The advertiser can also be told the number of views per month they are likely to get for their bid (N*(A/T)). These changes are calculated in real-time to give the new content provider an indication of how their bid will influence the value and the views they will receive for their bid. If the value and number of views are agreeable to the advertiser, they can choose to submit it as a bid for the defined period, such as a day, week, or month, for instance. The details of other content providers are, preferably, not made public. Content providers may also enter a maximum value M they can part with for their content. This provides content providers with some security against paying too much if the value changes. If the value goes too high, then a content provider's bid can drop off the list (if P is greater than M then A is not counted as a bid for that particular content provider). The bid would go back on the list if the value went down again, thus acting as a stabilizing mechanism. The content provider can, in a preferred embodiment, be notified by e-mail if their content 902 has dropped off the list due to their value limit M.
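A real-time preview of this kind might look like the following Python sketch. The function names and the dictionary keys are hypothetical, and the figure of 200 for N is chosen purely for illustration. The check against M is applied to a value of P supplied by the caller; how P is recomputed once a bid drops off the list is not specified here and is left out of the sketch.

```python
from typing import Dict, Optional


def preview_bid(existing: Dict[str, float], sent: int, new_bid: float) -> Dict[str, float]:
    """Preview the effect of adding new_bid (A) to the existing selection factors."""
    if sent <= 0 or new_bid <= 0:
        raise ValueError("sent and new_bid must be positive")
    new_total = sum(existing.values()) + new_bid         # new T
    return {
        "total": new_total,
        "value": new_total / sent,                       # new P = T / N
        "expected_views": sent * (new_bid / new_total),  # N * (A / T)
    }


def bid_is_counted(current_value: float, max_value: Optional[float]) -> bool:
    """A bid is not counted while P exceeds the provider's limit M (None = no limit)."""
    return max_value is None or current_value <= max_value


# Illustrative: one existing bid of 10, a new bid of 5, and an assumed N of 200.
preview = preview_bid({"provider_2": 10.0}, sent=200, new_bid=5.0)
```

With these figures the previewed value is 15/200 and the new bidder would expect one third of the 200 transmissions.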

[0350] As shown by the content provider details table 812 of FIG. 24, for instance, content providers thus have an account with the search engine proprietors, and the amount to be debited from their account for their content is automatically calculated from the account details on a periodic basis. An electronic statement of the number of views, cost per view, number of click-throughs and cost per click-through for each content provider is also forwarded to each content provider, since this information is also stored in content provider details table 812 (Table 14). In a preferred embodiment, it is possible to identify clusters of similar keywords based on the keyword link table. The reason for identifying clusters of keywords is so that content 902 can be targeted at groups of words rather than just individual words. The cluster for the key-word “car” may include hundreds or thousands of words that have links to the word car (e.g. convertibles, automobiles, vans). Statistical clustering techniques are used to define the size and frequency of key-word clusters. This makes it a much more automatic process than an editor deciding on clusters of keywords for content providers to target.

[0351] The same system can be used to set values for keyword clusters. While grouping words in this way would incur an increased administration cost, it is nevertheless computationally similar and would only be initiated once a certain level of hits on a keyword had been exceeded.

[0352] Content only search. Users can also purposely choose to search only the content providers associated with a keyword. In this case the search results will be based on the values of A in Table 12. The content providers that pay the most will be at the top of the list.
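A content only search of this kind reduces to ordering the content providers for a keyword by their selection factor A. A minimal Python sketch follows; the function name and the sample figures are illustrative assumptions, and providers with equal values keep their original order.

```python
from typing import Dict, List


def content_only_results(factors: Dict[str, float]) -> List[str]:
    """Order the providers for one keyword by selection factor A, highest first."""
    return sorted(factors, key=factors.get, reverse=True)


# Illustrative selection factors for a single keyword.
ranking = content_only_results({"provider_1": 5.0, "provider_2": 12.0, "provider_3": 7.0})
```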

[0353] The key-word suggester can also help content providers choose key-words or sets of key-words that they would like to display.

[0354] Controlling the Search Engine System

[0355] There are a number of parameters that can change the way in which the search engine according to the present invention ranks web pages. These factors (described in detail below) are:

[0356] History factor

[0357] This determines the rate of decay of the existing popularity lists (popular hit list), as described previously. It is a number between 0 and 1. A high history factor will make it difficult to change the existing popularity lists. As an example, if the rate of searching for a particular keyword is increasing quickly, then the history factor should be lower to enable emerging web pages to rise up the popularity list.

[0358] Frequency of updating Table 3 from the cumulative surfer trace

[0359] This is a measure of the frequency with which the popularity lists are updated with information about the users' activities (i.e. the surfer trace). For example, the lists may be updated once a day or even once a month, depending on the rate of change of popularity of particular keyword searches.

[0360] Sampling frequency

[0361] This is the frequency of sampling the information of how users are searching. If it is a common keyword, it is not necessary to monitor every search. It may be that only a percentage of all searches need be monitored to accurately determine web-page popularity.

[0362] The composition of the default search list (mix of results from the new web-page list, high-flyers and popular-lists etc.)

[0363] The mix of web pages presented to the user as a default can be changed if necessary to reflect the way in which search results evolve over time.

[0364] Content ‘hit factor’

[0365] The “content hit factor” is a measure of the weighting given to a hit on content being recorded as a hit for a keyword. The default setting is that a hit on content counts the same as a hit from the list of web pages. The value of content hits can be set higher or lower than unity, depending on the price of the content, e.g. the “content hit factor” may need to be increased for valuable keywords, as this would decrease the ability to spam these commercially valuable keywords. The higher the content hit factor, the higher the resistance to spam, as the search results would be more dependent on price rather than popularity.

[0366] The time period for content bidding

[0367] Content providers bid a certain amount for a particular time period, e.g. one month. This time period may be different depending on the rate of change of the price. If the price is changing rapidly, the time period may be shortened; if it is very stable, the time period may be lengthened.

[0368] Number of key-words per web-page submission

[0369] This number could be changed to influence how the system learns from new web page submissions.

[0370] Length of time between accepting new-web-page submissions

[0371] If the date of submission for a web-page is too close to the existing submission for that web-page, then it is not accepted. This length of time can be changed depending on any of the above factors.

[0372] Number of searches per day, per person (IP address or user ID) that count as valid hits

[0373] This number can be changed to reduce the possibility of spamming.

[0374] Length of time before renewing the security Table

[0375] The security Table, which restricts abuse, notes the links between keywords and IP addresses or user identifications. The length of time between refreshing this Table can be changed to make it harder to spam the system.

[0376] The settings for these factors can be different for different keywords or groups of people depending on:

[0377] Frequency with which searches are done

[0378] The rate-of-change of frequency of searches

[0379] The price of the content

[0380] The rate of change of price of content

[0381] The precise setting of each of these factors will not be known until the system begins operation and starts ‘learning’ about the users' behaviors. The optimum settings for different situations may be determined by experimentation.
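Three of the control parameters listed above (history factor, sampling frequency, and the number of searches per day per person that count as valid hits) might be sketched in Python as follows. This is illustrative only: the decay formula shown is an assumed form, not the update rule set out earlier in this description, and every name is hypothetical.

```python
import random
from collections import defaultdict


def decayed_popularity(old_score: float, new_hits: float, history_factor: float) -> float:
    """Blend an existing popularity score with new hits.

    Assumed form: history_factor * old_score + new_hits. A factor near 1
    preserves the existing list; a factor near 0 lets new hits dominate.
    """
    if not 0.0 <= history_factor <= 1.0:
        raise ValueError("history_factor must lie between 0 and 1")
    return history_factor * old_score + new_hits


def should_sample(sampling_rate: float, rng=random) -> bool:
    """Monitor only a fraction of searches (sampling_rate between 0 and 1)."""
    return rng.random() < sampling_rate


class DailyHitLimiter:
    """Count at most max_hits_per_day searches per user (IP address or user ID)."""

    def __init__(self, max_hits_per_day: int) -> None:
        self.max_hits_per_day = max_hits_per_day
        self._counts = defaultdict(int)  # (user, day) -> searches counted so far

    def is_valid_hit(self, user: str, day: str) -> bool:
        key = (user, day)
        if self._counts[key] >= self.max_hits_per_day:
            return False
        self._counts[key] += 1
        return True
```

Each parameter could be held per keyword or per group of users, so that, for instance, a rapidly changing keyword is given a lower history factor than a stable one.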

[0382] Other Applications

[0383] Though the preferred embodiment has been described with reference to software useable on a computer network for searching the Internet, it will be appreciated that the invention may be readily applied to any search system where a human user chooses results from a set of initial search results. Such a system may, for example, be part of an intranet, a LAN or WAN, or even a database on an individual PC.

[0384] Examples of other possible areas of application for the present invention are described below.

[0385] Intranet Searches and Other Data Base Searches

[0386] Intranet searches at present suffer from drawbacks similar to those of Internet searches. Indeed, some intranets can in themselves be extremely substantial systems, in which identifying a particular information source or item can be equally problematic. Utilizing the present invention in such applications is within the intended scope of the present invention.

[0387] Searching Other Media Forms

[0388] The present invention is also intended to be applied to matching a user's profile to other media sources (such as pay-per-view, television, videos, music and the like), thus allowing content to be targeted to a particular audience. The same form of search lists as described above (Popular-list, High-flyers, Hot-off the press, etc.) may be employed to direct users to appropriate material.

[0389] Shopping

[0390] The search techniques described herein can be implemented in a consumer network to assist shoppers in selecting items from within one shop or among a large number of shops. Instead of a keyword-URL link Table, a keyword-item purchased link Table would be used, which records what items were purchased after each shopping request (key-word). This embodiment also records where the user purchased the product. Each time a shopper purchased an item, this would increment the popularity of that item, using the same techniques described previously.

[0391] The profile types in this embodiment can be used to record the types of purchases made by different sets of people. One could, for example, select a profile type and see what are the most commonly purchased items for a range of users. This would provide assistance in choosing gifts for people who have a different profile type than oneself.
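As an illustration of the keyword-item purchased link Table, the following Python sketch counts purchases per shopping request and returns the most commonly purchased items. The class and method names are hypothetical, and the simple counter stands in for the popularity techniques described previously rather than reproducing them.

```python
from collections import Counter
from typing import Dict, List, Tuple


class PurchaseLinkTable:
    """Link each shopping keyword to the (item, shop) pairs purchased after it."""

    def __init__(self) -> None:
        self._counts: Dict[str, Counter] = {}

    def record_purchase(self, keyword: str, item: str, shop: str) -> None:
        """Increment the popularity of an item, recording where it was bought."""
        self._counts.setdefault(keyword, Counter())[(item, shop)] += 1

    def most_purchased(self, keyword: str, limit: int = 10) -> List[Tuple[Tuple[str, str], int]]:
        """Return up to `limit` (item, shop) pairs for the keyword, most purchased first."""
        return self._counts.get(keyword, Counter()).most_common(limit)


# Illustrative data only.
table = PurchaseLinkTable()
table.record_purchase("flowers", "rose bouquet", "shop A")
table.record_purchase("flowers", "rose bouquet", "shop A")
table.record_purchase("flowers", "tulips", "shop B")
```

Keeping one such table per profile type would support the gift-selection use described above.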

[0392] Scientific Publications

[0393] Searching scientific data bases (on-line papers, journals, etc.) with the present invention will dramatically reduce the time spent examining obscure or esoteric areas only to find the information irrelevant. The criteria for a valid hit for such uses would typically incorporate the extended time feature described above to establish the usefulness of the information source. The refereeing and referencing of academic/scientific papers using the present invention could be enhanced by classifying different levels or types of user, e.g. Dr, Professor, postgraduate, and so forth. This will enable users to see, for example, what information sources the eminent authorities in a particular field found of interest. It would also allow the authors of a paper to become aware of how often their publication was accessed and possibly further indicate where and how often the paper was used as a reference in subsequent papers. Users may have to formally register with different organizations to obtain levels of ability to referee. Users may also choose the level of refereeing for their searching.

[0394] Online Help

[0395] There is currently a substantial global requirement for on-line help and support, particularly for computer/software applications. Such a need would be considerably assuaged by use of the present invention, as the software developers obtain direct feedback on the type and frequency of particular inquiries, whilst the users receive the accumulated benefit of the previous users. Different profile types would enable the answers to be provided in an appropriate form for the user, e.g. novice, expert, etc. The keyword suggester may, for example, suggest searching with key-words (questions) more likely to yield a satisfactory response. There can be a range of answers to each question and, as the system learns, it will converge on the best answers.

[0396] Question and Answer Services

[0397] Current on-line question/answer programs could be configured to run via the present invention, thus enabling answers to repeatedly asked questions to be based on previous questions, and similar questions to be suggested.

[0398] Content Optimization on Other Parts of the Internet

[0399] The same content bidding mechanism could be used to determine the price of content for any location on the Internet, not just web page listings as identified above. In this embodiment, content providers will bid for a general content space to set the price automatically.

[0400] The profile type information from the search engine could be used as a passport so that other advertisements on the Internet could be more targeted to different audiences. This profile type information could also be used by web-page developers to customize their web-page for different sets of users.

[0401] People Matching Service

[0402] In another embodiment, the system according to the present invention can be used as a dating service and/or a method for matching people with similar preferences by doing a statistical analysis to compare the individual preferences (Table 6) of groups of users. The individual past preference Tables, in this embodiment, would preferably be normalized and compared to each other using a standard correlation coefficient. When a user's Table is compared to those of other users, this would give a numerical indication of how similar their preferences are.
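One possible reading of this comparison is sketched below in Python, using the Pearson correlation coefficient as the standard correlation coefficient. Representing a past preference Table as a mapping from item to hit count, normalizing by the total count, and returning 0.0 when one Table has no variation are all assumptions of this sketch.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # no variation in one table: treated as uncorrelated (assumption)
    return cov / math.sqrt(var_x * var_y)


def preference_similarity(prefs_a: Dict[str, float], prefs_b: Dict[str, float]) -> float:
    """Normalize two preference tables over their shared key set and correlate them."""
    keys = sorted(set(prefs_a) | set(prefs_b))

    def normalize(prefs: Dict[str, float]) -> List[float]:
        total = sum(prefs.values())
        return [prefs.get(k, 0.0) / total if total else 0.0 for k in keys]

    return pearson(normalize(prefs_a), normalize(prefs_b))
```

A result near 1 indicates users whose preferences are distributed alike, while a result near -1 indicates opposite preferences.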

[0403] The same embodiment could also be used to find information about similar people from their past preferences Tables. For example, one could ask to be given the names of people in New Zealand with an interest in Ecological Economics, and a search could be made of the personal preferences Tables. Such an embodiment, however, would typically include a password/consent indicator that provides the consent of identified persons to give out their information. That consent could be given, for example, in only certain circumstances, such as to searchers who have a level of authority and a password indicating the same, or to persons who identify themselves with certain characteristics.

[0404] While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiments, it is understood that the invention is not limited to the disclosed embodiment. For example, each of the features described above can be used singly or in combination, as set forth below in the claims, without other features described above which are patentably significant by themselves. Accordingly, the present invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

We claim:
 1. In a computer network having a plurality of user sites, a method of weighting the relative importance of a plurality of data items stored in a database on a server computer comprising the steps of: receiving at said server computer a keyword from a user site; generating at said server computer a plurality of listings corresponding to said keyword, each listing also corresponding to one of said data items; transmitting from said server computer to one of said user sites said plurality of listings; detecting at said server computer which ones of said plurality of data items are selected by said user site, said user site being transmitted each selected one of said data items upon selection of said corresponding listing by said user site; and updating said database to weight said selected ones of said data items as relatively more important than unselected ones of said data items with respect to said keyword.
 2. A method according to claim 1, further including the steps of receiving, prior to said step of receiving said keyword, a password identifying a user; determining, using said password, if said user is one of a selected group; and wherein said step of detecting only occurs if said user is determined to be one of said selected group.
 3. A method according to claim 1 wherein, associated with each of said data items, is an update date; and wherein said step of generating generates said plurality of listings based upon data items corresponding to said keyword that have been most recently updated.
 4. A method according to claim 1 wherein, associated with each data item, is a recent weighting factor X and an old weighting factor Y; and wherein said step of generating generates said plurality of listings based upon data items that are increasing in popularity the fastest, as determined using said recent weighting factor X and said old weighting factor Y.
 5. A method according to claim 1, further including the steps of receiving, prior to said step of receiving said keyword, a password identifying a user; and wherein said step of generating generates, for said keyword received, as said plurality of listings only user specific listings associated with said user, said user specific listings having been detected in earlier ones of said detecting steps associated with said user.
 6. A method according to claim 1, wherein, associated with each data item, is a plurality of groups; and wherein said step of receiving said keyword also receives an identification of a first of said groups; and wherein said step of generating generates said plurality of listings from only those data items associated with said first identified group.
 7. A method according to claim 6, wherein said step of receiving receives an identification of a second of said groups; and wherein said step of generating generates said plurality of listings from only those data items associated with both said first and said second identified groups.
 8. A method according to claim 1, further including the step of determining if said keyword is permitted keyword; and wherein said step of generating is only performed if said keyword is a permitted keyword.
 9. A method according to claim 1 wherein said step of detecting only detects each keyword one time from each user site during a determined interval of time.
 10. A method according to claim 9, wherein said step of detecting each keyword one time includes the steps of: associating an identifier with each user site; and using said identifier to track keywords that have been entered from each of said plurality of user sites.
 11. A method according to claim 9, wherein said step of generating uses a history factor associated with each keyword in determining said plurality of listings.
 12. A method according to claim 10 wherein said history factor is a number less than or equal to 1 and greater than or equal to 0.
 13. In a computer network having a plurality of user sites and developer sites, a method of populating a database on a server computer comprising the steps of: entering a plurality of data items into said database from said developer sites, each of said data items entered into said database including as associated identifiers a plurality of associated keywords; and updating said database by entering a plurality of user traces, each of said user traces identifying one of said data items and an associated keyword so that each trace increases the relative importance of the associated data item with respect to said associated keyword.
 14. A method according to claim 13 wherein said step of entering said plurality of data items includes as one of said associated identifiers one of a creation date and an update date.
 15. A method according to claim 13 wherein said step of entering said plurality of data items includes as one of said associated identifiers a developer site identifier.
 16. A method according to claim 15 wherein said developer site identifier is used to prevent said developer site from being used during said step of updating for said data items entered by said developer site.
 17. A method according to claim 13 wherein each of said user traces includes a user site identifier; and said user site identifier is used to update a user site table.
 18. In a computer network having a plurality developer sites, a method of determining content to provide along with listings transmitted from a server computer to user sites comprising the steps of: obtaining a content listing from each of said plurality of said developer sites, each of said content listings including content, a developer identifier, and a keyword, and a keyword selection factor; determining a particular keyword from said obtained keywords that is the same for different content listings; and using the keyword selection factor in determining when to transmit said different content listings to said user sites.
 19. A method according to claim 18, further including obtaining a profile and a profile selection factor for each content listing; determining a particular profile from said obtained profiles that is the same for different content listings; and using the profile selection factor in determining when to transmit said different content listings to said user sites.
 20. In a computer network having a plurality of user sites, a method of weighting the relative importance of a plurality of keywords stored in a database on a server computer comprising the steps of: receiving at said server computer an initial keyword from a user site; generating at said server computer a plurality of related keywords corresponding to said initial keyword; transmitting from said server computer to one of said user sites said plurality of related keywords; detecting at said server computer which one of said plurality of related keywords are selected by said user site; and updating said database to weight a relationship of said selected keyword and said initial keyword greater than a relationship of said unselected keywords and said initial keyword.